The architecture has been updated
parent
805f7a017e
commit
a01257ead9
1119 changed files with 226 additions and 352 deletions
20
hermes_code/website/.gitignore
vendored
Normal file
|
|
@ -0,0 +1,20 @@
|
|||
# Dependencies
|
||||
/node_modules
|
||||
|
||||
# Production
|
||||
/build
|
||||
|
||||
# Generated files
|
||||
.docusaurus
|
||||
.cache-loader
|
||||
|
||||
# Misc
|
||||
.DS_Store
|
||||
.env.local
|
||||
.env.development.local
|
||||
.env.test.local
|
||||
.env.production.local
|
||||
|
||||
npm-debug.log*
|
||||
yarn-debug.log*
|
||||
yarn-error.log*
|
||||
45
hermes_code/website/README.md
Normal file
|
|
@ -0,0 +1,45 @@
|
|||
# Website
|
||||
|
||||
This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
yarn
|
||||
```
|
||||
|
||||
## Local Development
|
||||
|
||||
```bash
|
||||
yarn start
|
||||
```
|
||||
|
||||
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
|
||||
|
||||
## Build
|
||||
|
||||
```bash
|
||||
yarn build
|
||||
```
|
||||
|
||||
This command generates static content in the `build` directory, which can be served by any static content hosting service.
|
||||
|
||||
## Deployment
|
||||
|
||||
Using SSH:
|
||||
|
||||
```bash
|
||||
USE_SSH=true yarn deploy
|
||||
```
|
||||
|
||||
Not using SSH:
|
||||
|
||||
```bash
|
||||
GIT_USER=<Your GitHub username> yarn deploy
|
||||
```
|
||||
|
||||
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
|
||||
|
||||
## Diagram Linting
|
||||
|
||||
CI runs `ascii-guard` to lint docs for ASCII box diagrams. Use Mermaid (a `mermaid` fenced code block) or plain lists/tables instead of ASCII boxes to avoid CI failures.
|
||||
8
hermes_code/website/docs/developer-guide/_category_.json
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "Developer Guide",
|
||||
"position": 3,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Contribute to Hermes Agent — architecture, tools, skills, and more."
|
||||
}
|
||||
}
|
||||
182
hermes_code/website/docs/developer-guide/acp-internals.md
Normal file
|
|
@ -0,0 +1,182 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "ACP Internals"
|
||||
description: "How the ACP adapter works: lifecycle, sessions, event bridge, approvals, and tool rendering"
|
||||
---
|
||||
|
||||
# ACP Internals
|
||||
|
||||
The ACP adapter wraps Hermes' synchronous `AIAgent` in an async JSON-RPC stdio server.
|
||||
|
||||
Key implementation files:
|
||||
|
||||
- `acp_adapter/entry.py`
|
||||
- `acp_adapter/server.py`
|
||||
- `acp_adapter/session.py`
|
||||
- `acp_adapter/events.py`
|
||||
- `acp_adapter/permissions.py`
|
||||
- `acp_adapter/tools.py`
|
||||
- `acp_adapter/auth.py`
|
||||
- `acp_registry/agent.json`
|
||||
|
||||
## Boot flow
|
||||
|
||||
```text
|
||||
hermes acp / hermes-acp / python -m acp_adapter
  -> acp_adapter.entry.main()
  -> load ~/.hermes/.env
  -> configure stderr logging
  -> construct HermesACPAgent
  -> acp.run_agent(agent)
|
||||
```
|
||||
|
||||
Stdout is reserved for ACP JSON-RPC transport. Human-readable logs go to stderr.
|
||||
|
||||
## Major components
|
||||
|
||||
### `HermesACPAgent`
|
||||
|
||||
`acp_adapter/server.py` implements the ACP agent protocol.
|
||||
|
||||
Responsibilities:
|
||||
|
||||
- initialize / authenticate
|
||||
- new/load/resume/fork/list/cancel session methods
|
||||
- prompt execution
|
||||
- session model switching
|
||||
- wiring sync AIAgent callbacks into ACP async notifications
|
||||
|
||||
### `SessionManager`
|
||||
|
||||
`acp_adapter/session.py` tracks live ACP sessions.
|
||||
|
||||
Each session stores:
|
||||
|
||||
- `session_id`
|
||||
- `agent`
|
||||
- `cwd`
|
||||
- `model`
|
||||
- `history`
|
||||
- `cancel_event`
|
||||
|
||||
The manager is thread-safe and supports:
|
||||
|
||||
- create
|
||||
- get
|
||||
- remove
|
||||
- fork
|
||||
- list
|
||||
- cleanup
|
||||
- cwd updates
|
||||
|
||||
### Event bridge
|
||||
|
||||
`acp_adapter/events.py` converts AIAgent callbacks into ACP `session_update` events.
|
||||
|
||||
Bridged callbacks:
|
||||
|
||||
- `tool_progress_callback`
|
||||
- `thinking_callback`
|
||||
- `step_callback`
|
||||
- `message_callback`
|
||||
|
||||
Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses:
|
||||
|
||||
```python
|
||||
asyncio.run_coroutine_threadsafe(...)
|
||||
```
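
A minimal sketch of that hand-off, under stated assumptions: `send_update` is a hypothetical coroutine standing in for the ACP notification call, and `make_sync_callback` is an illustrative helper name, not something taken from `acp_adapter/events.py`.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def make_sync_callback(loop, send_update):
    """Return a plain function that a worker thread can call.

    `send_update` is a hypothetical coroutine function standing in for
    whatever emits the ACP notification.
    """
    def callback(payload):
        # Schedule the coroutine on the loop that owns ACP I/O, then wait
        # for it so events keep the order in which the worker emitted them.
        future = asyncio.run_coroutine_threadsafe(send_update(payload), loop)
        return future.result(timeout=5)
    return callback


async def demo():
    received = []

    async def send_update(payload):
        received.append(payload)

    loop = asyncio.get_running_loop()
    callback = make_sync_callback(loop, send_update)

    def worker():
        # Simulates the synchronous agent emitting progress events.
        callback({"kind": "tool_progress", "text": "step 1"})
        callback({"kind": "message", "text": "done"})

    with ThreadPoolExecutor(max_workers=1) as pool:
        await loop.run_in_executor(pool, worker)
    return received
```

Calling `asyncio.run(demo())` runs the worker in a thread while the event loop stays free to deliver each update. Blocking on `future.result()` is one possible design choice; a fire-and-forget bridge would skip that wait.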
|
||||
|
||||
### Permission bridge
|
||||
|
||||
`acp_adapter/permissions.py` adapts dangerous terminal approval prompts into ACP permission requests.
|
||||
|
||||
Mapping:
|
||||
|
||||
- `allow_once` -> Hermes `once`
|
||||
- `allow_always` -> Hermes `always`
|
||||
- reject options -> Hermes `deny`
|
||||
|
||||
Timeouts and bridge failures deny by default.
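
The mapping can be pictured as a small lookup with a deny fallback. This is a sketch only: the table and function names are hypothetical, and only the option names and the Hermes decisions come from the list above.

```python
# Hypothetical mapping table; only the option names and the Hermes
# decisions ("once", "always", "deny") come from the text above.
_ACP_TO_HERMES = {
    "allow_once": "once",
    "allow_always": "always",
}


def map_permission_outcome(option_id):
    """Translate an ACP permission option into a Hermes decision.

    Anything unrecognized, including None for a timeout or bridge
    failure, is treated as a denial.
    """
    return _ACP_TO_HERMES.get(option_id, "deny")
```

Using `dict.get` with a `"deny"` default means a new or misspelled option can never be read as an approval.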
|
||||
|
||||
### Tool rendering helpers
|
||||
|
||||
`acp_adapter/tools.py` maps Hermes tools to ACP tool kinds and builds editor-facing content.
|
||||
|
||||
Examples:
|
||||
|
||||
- `patch` / `write_file` -> file diffs
|
||||
- `terminal` -> shell command text
|
||||
- `read_file` / `search_files` -> text previews
|
||||
- large results -> truncated text blocks for UI safety
|
||||
|
||||
## Session lifecycle
|
||||
|
||||
```text
|
||||
new_session(cwd)
  -> create SessionState
  -> create AIAgent(platform="acp", enabled_toolsets=["hermes-acp"])
  -> bind task_id/session_id to cwd override

prompt(..., session_id)
  -> extract text from ACP content blocks
  -> reset cancel event
  -> install callbacks + approval bridge
  -> run AIAgent in ThreadPoolExecutor
  -> update session history
  -> emit final agent message chunk
|
||||
```
|
||||
|
||||
### Cancelation
|
||||
|
||||
`cancel(session_id)`:
|
||||
|
||||
- sets the session cancel event
|
||||
- calls `agent.interrupt()` when available
|
||||
- causes the prompt response to return `stop_reason="cancelled"`
|
||||
|
||||
### Forking
|
||||
|
||||
`fork_session()` deep-copies message history into a new live session, preserving conversation state while giving the fork its own session ID and cwd.
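
Why a deep copy matters can be shown with a trimmed-down stand-in. The `SessionState` fields and the `fork_session` signature below are assumptions made for illustration, not the real definitions in `acp_adapter/session.py`.

```python
import copy
import uuid
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical, trimmed-down stand-in for the real session record."""
    session_id: str
    cwd: str
    history: list = field(default_factory=list)


def fork_session(source, cwd):
    """Create a new session whose history is independent of the source."""
    return SessionState(
        session_id=uuid.uuid4().hex,
        cwd=cwd,
        # deepcopy so nested message dicts are not shared between sessions
        history=copy.deepcopy(source.history),
    )
```

With a shallow copy, editing a message dict in the fork would also change the original session's history.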
|
||||
|
||||
## Provider/auth behavior
|
||||
|
||||
ACP does not implement its own auth store.
|
||||
|
||||
Instead it reuses Hermes' runtime resolver:
|
||||
|
||||
- `acp_adapter/auth.py`
|
||||
- `hermes_cli/runtime_provider.py`
|
||||
|
||||
So ACP advertises and uses the currently configured Hermes provider/credentials.
|
||||
|
||||
## Working directory binding
|
||||
|
||||
ACP sessions carry an editor cwd.
|
||||
|
||||
The session manager binds that cwd to the ACP session ID via task-scoped terminal/file overrides, so file and terminal tools operate relative to the editor workspace.
|
||||
|
||||
## Duplicate same-name tool calls
|
||||
|
||||
The event bridge keeps a FIFO queue of tool IDs per tool name, rather than a single ID per name. This is important for:
|
||||
|
||||
- parallel same-name calls
|
||||
- repeated same-name calls in one step
|
||||
|
||||
Without FIFO queues, completion events would attach to the wrong tool invocation.
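
A rough sketch of the idea, using a hypothetical `ToolCallTracker` class (the real bridge may store more than the ID):

```python
from collections import defaultdict, deque


class ToolCallTracker:
    """Hypothetical tracker: one FIFO queue of call IDs per tool name."""

    def __init__(self):
        self._pending = defaultdict(deque)

    def started(self, tool_name, call_id):
        self._pending[tool_name].append(call_id)

    def completed(self, tool_name):
        """Return the oldest outstanding call ID for this tool, or None."""
        queue = self._pending.get(tool_name)
        if not queue:
            return None
        return queue.popleft()
```

This sketch assumes completions for a given tool name arrive in the same order the calls started. With a single slot per name, the second `started()` would overwrite the first ID.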
|
||||
|
||||
## Approval callback restoration
|
||||
|
||||
ACP temporarily installs an approval callback on the terminal tool during prompt execution, then restores the previous callback afterward. This avoids leaving ACP session-specific approval handlers installed globally forever.
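
One way to get that install-then-restore behavior is a context manager. `TerminalTool` and `temporary_approval_callback` are hypothetical names used only for this sketch.

```python
from contextlib import contextmanager


class TerminalTool:
    """Hypothetical stand-in for the object that holds the approval callback."""
    approval_callback = None


@contextmanager
def temporary_approval_callback(tool, callback):
    """Install `callback` for the duration of a prompt, then restore."""
    previous = tool.approval_callback
    tool.approval_callback = callback
    try:
        yield
    finally:
        # Runs on success, error, or cancellation.
        tool.approval_callback = previous
```

The `finally` clause is the important part: the previous callback comes back even when prompt execution raises.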
|
||||
|
||||
## Current limitations
|
||||
|
||||
- ACP sessions are process-local from the ACP server's point of view
|
||||
- non-text prompt blocks are currently ignored for request text extraction
|
||||
- editor-specific UX varies by ACP client implementation
|
||||
|
||||
## Related files
|
||||
|
||||
- `tests/acp/` — ACP test suite
|
||||
- `toolsets.py` — `hermes-acp` toolset definition
|
||||
- `hermes_cli/main.py` — `hermes acp` CLI subcommand
|
||||
- `pyproject.toml` — `[acp]` optional dependency + `hermes-acp` script
|
||||
424
hermes_code/website/docs/developer-guide/adding-providers.md
Normal file
|
|
@ -0,0 +1,424 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Adding Providers"
|
||||
description: "How to add a new inference provider to Hermes Agent — auth, runtime resolution, CLI flows, adapters, tests, and docs"
|
||||
---
|
||||
|
||||
# Adding Providers
|
||||
|
||||
Hermes can already talk to any OpenAI-compatible endpoint through the custom provider path. Do not add a built-in provider unless you want first-class UX for that service:
|
||||
|
||||
- provider-specific auth or token refresh
|
||||
- a curated model catalog
|
||||
- setup / `hermes model` menu entries
|
||||
- provider aliases for `provider:model` syntax
|
||||
- a non-OpenAI API shape that needs an adapter
|
||||
|
||||
If the provider is just "another OpenAI-compatible base URL and API key", a named custom provider may be enough.
|
||||
|
||||
## The mental model
|
||||
|
||||
A built-in provider has to line up across a few layers:
|
||||
|
||||
1. `hermes_cli/auth.py` decides how credentials are found.
|
||||
2. `hermes_cli/runtime_provider.py` turns that into runtime data:
   - `provider`
   - `api_mode`
   - `base_url`
   - `api_key`
   - `source`
|
||||
3. `run_agent.py` uses `api_mode` to decide how requests are built and sent.
|
||||
4. `hermes_cli/models.py`, `hermes_cli/main.py`, and `hermes_cli/setup.py` make the provider show up in the CLI.
|
||||
5. `agent/auxiliary_client.py` and `agent/model_metadata.py` keep side tasks and token budgeting working.
|
||||
|
||||
The important abstraction is `api_mode`.
|
||||
|
||||
- Most providers use `chat_completions`.
|
||||
- Codex uses `codex_responses`.
|
||||
- Anthropic uses `anthropic_messages`.
|
||||
- A new non-OpenAI protocol usually means adding a new adapter and a new `api_mode` branch.
|
||||
|
||||
## Choose the implementation path first
|
||||
|
||||
### Path A — OpenAI-compatible provider
|
||||
|
||||
Use this when the provider accepts standard chat-completions style requests.
|
||||
|
||||
Typical work:
|
||||
|
||||
- add auth metadata
|
||||
- add model catalog / aliases
|
||||
- add runtime resolution
|
||||
- add CLI menu wiring
|
||||
- add aux-model defaults
|
||||
- add tests and user docs
|
||||
|
||||
You usually do not need a new adapter or a new `api_mode`.
|
||||
|
||||
### Path B — Native provider
|
||||
|
||||
Use this when the provider does not behave like OpenAI chat completions.
|
||||
|
||||
Examples in-tree today:
|
||||
|
||||
- `codex_responses`
|
||||
- `anthropic_messages`
|
||||
|
||||
This path includes everything from Path A plus:
|
||||
|
||||
- a provider adapter in `agent/`
|
||||
- `run_agent.py` branches for request building, dispatch, usage extraction, interrupt handling, and response normalization
|
||||
- adapter tests
|
||||
|
||||
## File checklist
|
||||
|
||||
### Required for every built-in provider
|
||||
|
||||
1. `hermes_cli/auth.py`
|
||||
2. `hermes_cli/models.py`
|
||||
3. `hermes_cli/runtime_provider.py`
|
||||
4. `hermes_cli/main.py`
|
||||
5. `hermes_cli/setup.py`
|
||||
6. `agent/auxiliary_client.py`
|
||||
7. `agent/model_metadata.py`
|
||||
8. tests
|
||||
9. user-facing docs under `website/docs/`
|
||||
|
||||
### Additional for native / non-OpenAI providers
|
||||
|
||||
10. `agent/<provider>_adapter.py`
|
||||
11. `run_agent.py`
|
||||
12. `pyproject.toml` if a provider SDK is required
|
||||
|
||||
## Step 1: Pick one canonical provider id
|
||||
|
||||
Choose a single provider id and use it everywhere.
|
||||
|
||||
Examples from the repo:
|
||||
|
||||
- `openai-codex`
|
||||
- `kimi-coding`
|
||||
- `minimax-cn`
|
||||
|
||||
That same id should appear in:
|
||||
|
||||
- `PROVIDER_REGISTRY` in `hermes_cli/auth.py`
|
||||
- `_PROVIDER_LABELS` in `hermes_cli/models.py`
|
||||
- `_PROVIDER_ALIASES` in both `hermes_cli/auth.py` and `hermes_cli/models.py`
|
||||
- CLI `--provider` choices in `hermes_cli/main.py`
|
||||
- setup / model selection branches
|
||||
- auxiliary-model defaults
|
||||
- tests
|
||||
|
||||
If the id differs between those files, the provider will feel half-wired: auth may work while `/model`, setup, or runtime resolution silently misses it.
|
||||
|
||||
## Step 2: Add auth metadata in `hermes_cli/auth.py`
|
||||
|
||||
For API-key providers, add a `ProviderConfig` entry to `PROVIDER_REGISTRY` with:
|
||||
|
||||
- `id`
|
||||
- `name`
|
||||
- `auth_type="api_key"`
|
||||
- `inference_base_url`
|
||||
- `api_key_env_vars`
|
||||
- optional `base_url_env_var`
|
||||
|
||||
Also add aliases to `_PROVIDER_ALIASES`.
|
||||
|
||||
Use the existing providers as templates:
|
||||
|
||||
- simple API-key path: Z.AI, MiniMax
|
||||
- API-key path with endpoint detection: Kimi, Z.AI
|
||||
- native token resolution: Anthropic
|
||||
- OAuth / auth-store path: Nous, OpenAI Codex
|
||||
|
||||
Questions to answer here:
|
||||
|
||||
- What env vars should Hermes check, and in what priority order?
|
||||
- Does the provider need base-URL overrides?
|
||||
- Does it need endpoint probing or token refresh?
|
||||
- What should the auth error say when credentials are missing?
|
||||
|
||||
If the provider needs something more than "look up an API key", add a dedicated credential resolver instead of shoving logic into unrelated branches.
|
||||
|
||||
## Step 3: Add model catalog and aliases in `hermes_cli/models.py`
|
||||
|
||||
Update the provider catalog so the provider works in menus and in `provider:model` syntax.
|
||||
|
||||
Typical edits:
|
||||
|
||||
- `_PROVIDER_MODELS`
|
||||
- `_PROVIDER_LABELS`
|
||||
- `_PROVIDER_ALIASES`
|
||||
- provider display order inside `list_available_providers()`
|
||||
- `provider_model_ids()` if the provider supports a live `/models` fetch
|
||||
|
||||
If the provider exposes a live model list, prefer that first and keep `_PROVIDER_MODELS` as the static fallback.
|
||||
|
||||
This file is also what makes inputs like these work:
|
||||
|
||||
```text
|
||||
anthropic:claude-sonnet-4-6
|
||||
kimi:model-name
|
||||
```
|
||||
|
||||
If aliases are missing here, the provider may authenticate correctly but still fail in `/model` parsing.
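
To make the failure mode concrete, here is a simplified parser. The alias table, the known-provider set, and `parse_model_input` are hypothetical and much smaller than whatever `hermes_cli/models.py` actually does.

```python
# Hypothetical alias table; the real one lives in hermes_cli/models.py.
_PROVIDER_ALIASES = {"kimi": "kimi-coding", "claude": "anthropic"}
_KNOWN_PROVIDERS = {"anthropic", "kimi-coding"}


def parse_model_input(text):
    """Split "provider:model" into (provider_id, model).

    Returns (None, text) when the prefix is not a known provider or alias,
    so model names that contain a colon are left untouched.
    """
    prefix, sep, rest = text.partition(":")
    if not sep or not rest:
        return None, text
    provider = _PROVIDER_ALIASES.get(prefix.lower(), prefix.lower())
    if provider not in _KNOWN_PROVIDERS:
        return None, text
    return provider, rest
```

If the `kimi` alias were missing from the table, `kimi:model-name` would fall through to the "not a provider" branch even though credentials for the provider resolve fine.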
|
||||
|
||||
## Step 4: Resolve runtime data in `hermes_cli/runtime_provider.py`
|
||||
|
||||
`resolve_runtime_provider()` is the shared path used by CLI, gateway, cron, ACP, and helper clients.
|
||||
|
||||
Add a branch that returns a dict with at least:
|
||||
|
||||
```python
{
    "provider": "your-provider",
    "api_mode": "chat_completions",  # or your native mode
    "base_url": "https://...",
    "api_key": "...",
    "source": "env|portal|auth-store|explicit",
    "requested_provider": requested_provider,
}
```
|
||||
|
||||
If the provider is OpenAI-compatible, `api_mode` should usually stay `chat_completions`.
|
||||
|
||||
Be careful with API-key precedence. Hermes already contains logic to avoid leaking an OpenRouter key to unrelated endpoints. A new provider should be equally explicit about which key goes to which base URL.
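
A sketch of what "explicit" can mean in practice, assuming a hypothetical provider `example-ai` with made-up env var names. The helper names are illustrative and not taken from `runtime_provider.py`.

```python
import os


def pick_api_key(env_vars, environ=None):
    """Return (key, var_name) for the first non-empty variable, in order."""
    environ = os.environ if environ is None else environ
    for name in env_vars:
        value = environ.get(name, "").strip()
        if value:
            return value, name
    return None, None


def resolve_example_provider(requested_provider="example-ai", environ=None):
    """Hypothetical runtime branch for a provider called "example-ai"."""
    # Only this provider's own variables are consulted, so a key meant
    # for another service is never sent to this base URL.
    key, _ = pick_api_key(["EXAMPLE_AI_API_KEY", "EXAMPLE_API_KEY"], environ)
    if key is None:
        raise RuntimeError("Set EXAMPLE_AI_API_KEY to use example-ai")
    return {
        "provider": "example-ai",
        "api_mode": "chat_completions",
        "base_url": "https://api.example.invalid/v1",
        "api_key": key,
        "source": "env",
        "requested_provider": requested_provider,
    }
```

The list passed to `pick_api_key` doubles as documentation of the priority order, and a key for a different service is simply never looked at.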
|
||||
|
||||
## Step 5: Wire the CLI in `hermes_cli/main.py` and `hermes_cli/setup.py`
|
||||
|
||||
A provider is not discoverable until it shows up in the interactive flows.
|
||||
|
||||
Update:
|
||||
|
||||
### `hermes_cli/main.py`
|
||||
|
||||
- `provider_labels`
|
||||
- provider dispatch inside the `model` command
|
||||
- `--provider` argument choices
|
||||
- login/logout choices if the provider supports those flows
|
||||
- a `_model_flow_<provider>()` function, or reuse `_model_flow_api_key_provider()` if it fits
|
||||
|
||||
### `hermes_cli/setup.py`
|
||||
|
||||
- `provider_choices`
|
||||
- auth branch for the provider
|
||||
- model-selection branch
|
||||
- any provider-specific explanatory text
|
||||
- any place where a provider should be excluded from OpenRouter-only prompts or routing settings
|
||||
|
||||
If you only update one of these files, `hermes model` and `hermes setup` will drift.
|
||||
|
||||
## Step 6: Keep auxiliary calls working
|
||||
|
||||
Two files matter here:
|
||||
|
||||
### `agent/auxiliary_client.py`
|
||||
|
||||
Add a cheap / fast default aux model to `_API_KEY_PROVIDER_AUX_MODELS` if this is a direct API-key provider.
|
||||
|
||||
Auxiliary tasks include things like:
|
||||
|
||||
- vision summarization
|
||||
- web extraction summarization
|
||||
- context compression summaries
|
||||
- session-search summaries
|
||||
- memory flushes
|
||||
|
||||
If the provider has no sensible aux default, side tasks may fall back to an unsuitable model or unexpectedly run on an expensive main model.
|
||||
|
||||
### `agent/model_metadata.py`
|
||||
|
||||
Add context lengths for the provider's models so token budgeting, compression thresholds, and limits stay sane.
|
||||
|
||||
## Step 7: If the provider is native, add an adapter and `run_agent.py` support
|
||||
|
||||
If the provider is not plain chat completions, isolate the provider-specific logic in `agent/<provider>_adapter.py`.
|
||||
|
||||
Keep `run_agent.py` focused on orchestration. It should call adapter helpers, not hand-build provider payloads inline all over the file.
|
||||
|
||||
A native provider usually needs work in these places:
|
||||
|
||||
### New adapter file
|
||||
|
||||
Typical responsibilities:
|
||||
|
||||
- build the SDK / HTTP client
|
||||
- resolve tokens
|
||||
- convert OpenAI-style conversation messages to the provider's request format
|
||||
- convert tool schemas if needed
|
||||
- normalize provider responses back into what `run_agent.py` expects
|
||||
- extract usage and finish-reason data
|
||||
|
||||
### `run_agent.py`
|
||||
|
||||
Search for `api_mode` and audit every switch point. At minimum, verify:
|
||||
|
||||
- `__init__` chooses the new `api_mode`
|
||||
- client construction works for the provider
|
||||
- `_build_api_kwargs()` knows how to format requests
|
||||
- `_api_call_with_interrupt()` dispatches to the right client call
|
||||
- interrupt / client rebuild paths work
|
||||
- response validation accepts the provider's shape
|
||||
- finish-reason extraction is correct
|
||||
- token-usage extraction is correct
|
||||
- fallback-model activation can switch into the new provider cleanly
|
||||
- summary-generation and memory-flush paths still work
|
||||
|
||||
Also search `run_agent.py` for `self.client.`. Any code path that assumes the standard OpenAI client exists can break when a native provider uses a different client object or `self.client = None`.
|
||||
|
||||
### Prompt caching and provider-specific request fields
|
||||
|
||||
Prompt caching and provider-specific knobs are easy to regress.
|
||||
|
||||
Examples already in-tree:
|
||||
|
||||
- Anthropic has a native prompt-caching path
|
||||
- OpenRouter gets provider-routing fields
|
||||
- not every provider should receive every request-side option
|
||||
|
||||
When you add a native provider, double-check that Hermes is only sending fields that provider actually understands.
|
||||
|
||||
## Step 8: Tests
|
||||
|
||||
At minimum, touch the tests that guard provider wiring.
|
||||
|
||||
Common places:
|
||||
|
||||
- `tests/test_runtime_provider_resolution.py`
|
||||
- `tests/test_cli_provider_resolution.py`
|
||||
- `tests/test_cli_model_command.py`
|
||||
- `tests/test_setup_model_selection.py`
|
||||
- `tests/test_provider_parity.py`
|
||||
- `tests/test_run_agent.py`
|
||||
- `tests/test_<provider>_adapter.py` for a native provider
|
||||
|
||||
For docs-only examples, the exact file set may differ. The point is to cover:
|
||||
|
||||
- auth resolution
|
||||
- CLI menu / provider selection
|
||||
- runtime provider resolution
|
||||
- agent execution path
|
||||
- provider:model parsing
|
||||
- any adapter-specific message conversion
|
||||
|
||||
Run tests with xdist disabled:
|
||||
|
||||
```bash
|
||||
source venv/bin/activate
|
||||
python -m pytest tests/test_runtime_provider_resolution.py tests/test_cli_provider_resolution.py tests/test_cli_model_command.py tests/test_setup_model_selection.py -n0 -q
|
||||
```
|
||||
|
||||
For deeper changes, run the full suite before pushing:
|
||||
|
||||
```bash
|
||||
source venv/bin/activate
|
||||
python -m pytest tests/ -n0 -q
|
||||
```
|
||||
|
||||
## Step 9: Live verification
|
||||
|
||||
After tests, run a real smoke test.
|
||||
|
||||
```bash
|
||||
source venv/bin/activate
|
||||
python -m hermes_cli.main chat -q "Say hello" --provider your-provider --model your-model
|
||||
```
|
||||
|
||||
Also test the interactive flows if you changed menus:
|
||||
|
||||
```bash
|
||||
source venv/bin/activate
|
||||
python -m hermes_cli.main model
|
||||
python -m hermes_cli.main setup
|
||||
```
|
||||
|
||||
For native providers, verify at least one tool call too, not just a plain text response.
|
||||
|
||||
## Step 10: Update user-facing docs
|
||||
|
||||
If the provider is meant to ship as a first-class option, update the user docs too:
|
||||
|
||||
- `website/docs/getting-started/quickstart.md`
|
||||
- `website/docs/user-guide/configuration.md`
|
||||
- `website/docs/reference/environment-variables.md`
|
||||
|
||||
A developer can wire the provider perfectly and still leave users unable to discover the required env vars or setup flow.
|
||||
|
||||
## OpenAI-compatible provider checklist
|
||||
|
||||
Use this if the provider is standard chat completions.
|
||||
|
||||
- [ ] `ProviderConfig` added in `hermes_cli/auth.py`
|
||||
- [ ] aliases added in `hermes_cli/auth.py` and `hermes_cli/models.py`
|
||||
- [ ] model catalog added in `hermes_cli/models.py`
|
||||
- [ ] runtime branch added in `hermes_cli/runtime_provider.py`
|
||||
- [ ] CLI wiring added in `hermes_cli/main.py`
|
||||
- [ ] setup wiring added in `hermes_cli/setup.py`
|
||||
- [ ] aux model added in `agent/auxiliary_client.py`
|
||||
- [ ] context lengths added in `agent/model_metadata.py`
|
||||
- [ ] runtime / CLI tests updated
|
||||
- [ ] user docs updated
|
||||
|
||||
## Native provider checklist
|
||||
|
||||
Use this when the provider needs a new protocol path.
|
||||
|
||||
- [ ] everything in the OpenAI-compatible checklist
|
||||
- [ ] adapter added in `agent/<provider>_adapter.py`
|
||||
- [ ] new `api_mode` supported in `run_agent.py`
|
||||
- [ ] interrupt / rebuild path works
|
||||
- [ ] usage and finish-reason extraction works
|
||||
- [ ] fallback path works
|
||||
- [ ] adapter tests added
|
||||
- [ ] live smoke test passes
|
||||
|
||||
## Common pitfalls
|
||||
|
||||
### 1. Adding the provider to auth but not to model parsing
|
||||
|
||||
That makes credentials resolve correctly while `/model` and `provider:model` inputs fail.
|
||||
|
||||
### 2. Forgetting that `config["model"]` can be a string or a dict
|
||||
|
||||
A lot of provider-selection code has to normalize both forms.
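
A normalizer along these lines keeps the rest of the code from caring which form was used. The dict keys `default` and `provider` are assumptions made for this sketch, so check the real config schema before relying on them.

```python
def normalize_model_config(config):
    """Return (model_name, provider) whether config["model"] is a str or dict.

    The dict keys "default" and "provider" are assumptions made for this
    sketch; check the real config schema before relying on them.
    """
    raw = config.get("model")
    if isinstance(raw, str):
        return raw.strip() or None, None
    if isinstance(raw, dict):
        return raw.get("default") or None, raw.get("provider") or None
    return None, None
```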
|
||||
|
||||
### 3. Assuming a built-in provider is required
|
||||
|
||||
If the service is just OpenAI-compatible, a custom provider may already solve the user problem with less maintenance.
|
||||
|
||||
### 4. Forgetting auxiliary paths
|
||||
|
||||
The main chat path can work while summarization, memory flushes, or vision helpers fail because aux routing was never updated.
|
||||
|
||||
### 5. Native-provider branches hiding in `run_agent.py`
|
||||
|
||||
Search for `api_mode` and `self.client.`. Do not assume the obvious request path is the only one.
|
||||
|
||||
### 6. Sending OpenRouter-only knobs to other providers
|
||||
|
||||
Fields like provider routing belong only on the providers that support them.
|
||||
|
||||
### 7. Updating `hermes model` but not `hermes setup`
|
||||
|
||||
Both flows need to know about the provider.
|
||||
|
||||
## Good search targets while implementing
|
||||
|
||||
If you are hunting for all the places a provider touches, search these symbols:
|
||||
|
||||
- `PROVIDER_REGISTRY`
|
||||
- `_PROVIDER_ALIASES`
|
||||
- `_PROVIDER_MODELS`
|
||||
- `resolve_runtime_provider`
|
||||
- `_model_flow_`
|
||||
- `provider_choices`
|
||||
- `api_mode`
|
||||
- `_API_KEY_PROVIDER_AUX_MODELS`
|
||||
- `self.client.`
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Provider Runtime Resolution](./provider-runtime.md)
|
||||
- [Architecture](./architecture.md)
|
||||
- [Contributing](./contributing.md)
|
||||
208
hermes_code/website/docs/developer-guide/adding-tools.md
Normal file
|
|
@ -0,0 +1,208 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Adding Tools"
|
||||
description: "How to add a new tool to Hermes Agent — schemas, handlers, registration, and toolsets"
|
||||
---
|
||||
|
||||
# Adding Tools
|
||||
|
||||
Before writing a tool, ask yourself: **should this be a [skill](creating-skills.md) instead?**
|
||||
|
||||
Make it a **Skill** when the capability can be expressed as instructions + shell commands + existing tools (arXiv search, git workflows, Docker management, PDF processing).
|
||||
|
||||
Make it a **Tool** when it requires end-to-end integration with API keys, custom processing logic, binary data handling, or streaming (browser automation, TTS, vision analysis).
|
||||
|
||||
## Overview
|
||||
|
||||
Adding a tool touches **3 files**:
|
||||
|
||||
1. **`tools/your_tool.py`** — handler, schema, check function, `registry.register()` call
|
||||
2. **`toolsets.py`** — add tool name to `_HERMES_CORE_TOOLS` (or a specific toolset)
|
||||
3. **`model_tools.py`** — add `"tools.your_tool"` to the `_discover_tools()` list
|
||||
|
||||
## Step 1: Create the Tool File
|
||||
|
||||
Every tool file follows the same structure:
|
||||
|
||||
```python
# tools/weather_tool.py
"""Weather Tool -- look up current weather for a location."""

import json
import os
import logging

logger = logging.getLogger(__name__)


# --- Availability check ---

def check_weather_requirements() -> bool:
    """Return True if the tool's dependencies are available."""
    return bool(os.getenv("WEATHER_API_KEY"))


# --- Handler ---

def weather_tool(location: str, units: str = "metric") -> str:
    """Fetch weather for a location. Returns JSON string."""
    api_key = os.getenv("WEATHER_API_KEY")
    if not api_key:
        return json.dumps({"error": "WEATHER_API_KEY not configured"})
    try:
        # ... call weather API ...
        return json.dumps({"location": location, "temp": 22, "units": units})
    except Exception as e:
        return json.dumps({"error": str(e)})


# --- Schema ---

WEATHER_SCHEMA = {
    "name": "weather",
    "description": "Get current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name or coordinates (e.g. 'London' or '51.5,-0.1')"
            },
            "units": {
                "type": "string",
                "enum": ["metric", "imperial"],
                "description": "Temperature units (default: metric)",
                "default": "metric"
            }
        },
        "required": ["location"]
    }
}


# --- Registration ---

from tools.registry import registry

registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool(
        location=args.get("location", ""),
        units=args.get("units", "metric")),
    check_fn=check_weather_requirements,
    requires_env=["WEATHER_API_KEY"],
)
```
|
||||
|
||||
### Key Rules
|
||||
|
||||
:::danger Important
|
||||
- Handlers **MUST** return a JSON string (via `json.dumps()`), never raw dicts
|
||||
- Errors **MUST** be returned as `{"error": "message"}`, never raised as exceptions
|
||||
- The `check_fn` is called when building tool definitions — if it returns `False`, the tool is silently excluded
|
||||
- The `handler` receives `(args: dict, **kwargs)` where `args` is the LLM's tool call arguments
|
||||
:::
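
If you want a guard rail for the first two rules, a small decorator can enforce them. This is a hypothetical helper, not something the registry provides:

```python
import json
from functools import wraps


def json_tool_handler(func):
    """Hypothetical decorator enforcing the return-value rules above."""
    @wraps(func)
    def wrapper(args, **kwargs):
        try:
            result = func(args, **kwargs)
        except Exception as exc:  # never let a tool error propagate
            return json.dumps({"error": str(exc)})
        # Pass strings through; serialize anything else.
        return result if isinstance(result, str) else json.dumps(result)
    return wrapper


@json_tool_handler
def echo_handler(args, **kwargs):
    if "text" not in args:
        raise ValueError("text is required")
    return {"echo": args["text"]}
```

Here `echo_handler({"text": "hi"})` yields a JSON string, and a missing `text` argument comes back as an `{"error": ...}` payload instead of an exception.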
|
||||
|
||||
## Step 2: Add to a Toolset
|
||||
|
||||
In `toolsets.py`, add the tool name:
|
||||
|
||||
```python
# If it should be available on all platforms (CLI + messaging):
_HERMES_CORE_TOOLS = [
    ...
    "weather",  # <-- add here
]

# Or create a new standalone toolset:
"weather": {
    "description": "Weather lookup tools",
    "tools": ["weather"],
    "includes": []
},
```
|
||||
|
||||
## Step 3: Add Discovery Import
|
||||
|
||||
In `model_tools.py`, add the module to the `_discover_tools()` list:
|
||||
|
||||
```python
def _discover_tools():
    _modules = [
        ...
        "tools.weather_tool",  # <-- add here
    ]
```
|
||||
|
||||
This import triggers the `registry.register()` call at the bottom of your tool file.
|
||||
|
||||
## Async Handlers
|
||||
|
||||
If your handler needs async code, mark it with `is_async=True`:
|
||||
|
||||
```python
import json

import aiohttp  # third-party; assumed to be installed


async def weather_tool_async(location: str) -> str:
    async with aiohttp.ClientSession() as session:
        # ... call the weather API with `session` and build `result` ...
        result = {"location": location}
    return json.dumps(result)


registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool_async(args.get("location", "")),
    check_fn=check_weather_requirements,
    is_async=True,  # registry calls _run_async() automatically
)
```
|
||||
|
||||
The registry handles async bridging transparently — you never call `asyncio.run()` yourself.
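
For intuition only, a sync dispatcher can bridge to a coroutine roughly as below. `run_handler` is a hypothetical helper, and the real `_run_async()` may work differently, for example when an event loop is already running.

```python
import asyncio
import inspect
import json


def run_handler(handler, args, **kwargs):
    """Call a handler and, if it returned a coroutine, run it to completion.

    This simple version assumes no event loop is already running in the
    calling thread; the real registry may handle that case differently.
    """
    result = handler(args, **kwargs)
    if inspect.iscoroutine(result):
        return asyncio.run(result)
    return result


async def slow_lookup(location):
    await asyncio.sleep(0)  # stands in for a network call
    return json.dumps({"location": location})
```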
|
||||
|
||||
## Handlers That Need task_id
|
||||
|
||||
Tools that manage per-session state receive `task_id` via `**kwargs`:
|
||||
|
||||
```python
def _handle_weather(args, **kw):
    task_id = kw.get("task_id")
    return weather_tool(args.get("location", ""), task_id=task_id)

registry.register(
    name="weather",
    ...
    handler=_handle_weather,
)
```
|
||||
|
||||
## Agent-Loop Intercepted Tools
|
||||
|
||||
Some tools (`todo`, `memory`, `session_search`, `delegate_task`) need access to per-session agent state. These are intercepted by `run_agent.py` before reaching the registry. The registry still holds their schemas, but `dispatch()` returns a fallback error if the intercept is bypassed.
|
||||
|
||||
## Optional: Setup Wizard Integration
|
||||
|
||||
If your tool requires an API key, add it to `hermes_cli/config.py`:
|
||||
|
||||
```python
OPTIONAL_ENV_VARS = {
    ...
    "WEATHER_API_KEY": {
        "description": "Weather API key for weather lookup",
        "prompt": "Weather API key",
        "url": "https://weatherapi.com/",
        "tools": ["weather"],
        "password": True,
    },
}
```
|
||||
|
||||
## Checklist
|
||||
|
||||
- [ ] Tool file created with handler, schema, check function, and registration
|
||||
- [ ] Added to appropriate toolset in `toolsets.py`
|
||||
- [ ] Discovery import added to `model_tools.py`
|
||||
- [ ] Handler returns JSON strings, errors returned as `{"error": "..."}`
|
||||
- [ ] Optional: API key added to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
|
||||
- [ ] Optional: Added to `toolset_distributions.py` for batch processing
|
||||
- [ ] Tested with `hermes chat -q "Use the weather tool for London"`
|
||||
112
hermes_code/website/docs/developer-guide/agent-loop.md
Normal file
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Agent Loop Internals"
|
||||
description: "Detailed walkthrough of AIAgent execution, API modes, tools, callbacks, and fallback behavior"
|
||||
---
|
||||
|
||||
# Agent Loop Internals
|
||||
|
||||
The core orchestration engine is `run_agent.py`'s `AIAgent`.
|
||||
|
||||
## Core responsibilities
|
||||
|
||||
`AIAgent` is responsible for:
|
||||
|
||||
- assembling the effective prompt and tool schemas
|
||||
- selecting the correct provider/API mode
|
||||
- making interruptible model calls
|
||||
- executing tool calls (sequentially or concurrently)
|
||||
- maintaining session history
|
||||
- handling compression, retries, and fallback models
|
||||
|
||||
## API modes
|
||||
|
||||
Hermes currently supports three API execution modes:
|
||||
|
||||
| API mode | Used for |
|
||||
|----------|----------|
|
||||
| `chat_completions` | OpenAI-compatible chat endpoints, including OpenRouter and most custom endpoints |
|
||||
| `codex_responses` | OpenAI Codex / Responses API path |
|
||||
| `anthropic_messages` | Native Anthropic Messages API |
|
||||
|
||||
The mode is resolved from explicit args, provider selection, and base URL heuristics.
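One way to picture that resolution order is the sketch below. The precedence, the provider names, and the hostname check are assumptions chosen for illustration; only the three mode names come from the table above.

```python
from urllib.parse import urlparse


def resolve_api_mode(explicit_mode=None, provider=None, base_url=None):
    """Pick an API mode; precedence and heuristics here are assumptions."""
    if explicit_mode:
        return explicit_mode  # an explicit argument wins
    if provider == "anthropic":
        return "anthropic_messages"
    if provider == "codex":
        return "codex_responses"
    # Fall back to a base URL heuristic (hypothetical hostname rule).
    host = urlparse(base_url or "").hostname or ""
    if host.endswith("anthropic.com"):
        return "anthropic_messages"
    return "chat_completions"
```

Defaulting to `chat_completions` matches the table's note that most custom endpoints are OpenAI-compatible.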
|
||||
|
||||
## Turn lifecycle
|
||||
|
||||
```text
|
||||
run_conversation()
  -> generate effective task_id
  -> append current user message
  -> load or build cached system prompt
  -> maybe preflight-compress
  -> build api_messages
  -> inject ephemeral prompt layers
  -> apply prompt caching if appropriate
  -> make interruptible API call
  -> if tool calls: execute them, append tool results, loop
  -> if final text: persist, cleanup, return response
|
||||
```
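The core of the lifecycle above (call the model, run tools, loop until final text) can be sketched as follows. This is a simplified stand-in, not the real `run_conversation()`: it skips compression, caching, and persistence, and `call_model`, `run_tool`, and `fake_model` are hypothetical callables supplied by the example.

```python
def run_conversation(user_message, call_model, run_tool, max_iterations=10):
    """Loop until the model returns final text (simplified illustration)."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply["content"], messages  # final text ends the turn
        for call in tool_calls:
            # Append one tool result per call, then loop back to the model.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": run_tool(call["name"], call["arguments"]),
            })
    raise RuntimeError("iteration budget exhausted")


def fake_model(messages):
    # Ask for one tool call first, then answer once a tool result is present.
    if any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "It is sunny."}
    return {"role": "assistant", "content": None,
            "tool_calls": [{"id": "c1", "name": "weather", "arguments": {"location": "London"}}]}


answer, history = run_conversation("Weather?", fake_model, lambda name, args: '{"sky": "sunny"}')
```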
|
||||
|
||||
## Interruptible API calls
|
||||
|
||||
Hermes wraps API requests so they can be interrupted from the CLI or gateway.
|
||||
|
||||
This matters because:
|
||||
|
||||
- the agent may be in a long LLM call
|
||||
- the user may send a new message mid-flight
|
||||
- background systems may need cancellation semantics
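A common way to get this behavior is to run the blocking request in a worker thread and poll an interrupt flag while waiting. The sketch below shows that pattern under stated assumptions; `interruptible_call` is a hypothetical helper, not the wrapper Hermes actually uses.

```python
import threading
import time


def interruptible_call(fn, interrupt_event, poll_interval=0.01):
    """Run fn() in a worker thread; return (done, result), giving up early on interrupt."""
    box = {}
    worker = threading.Thread(target=lambda: box.update(result=fn()), daemon=True)
    worker.start()
    while worker.is_alive():
        if interrupt_event.is_set():
            return False, None  # caller stops waiting; the worker is abandoned
        worker.join(poll_interval)
    return True, box.get("result")


def slow_call():
    # Stand-in for a long LLM request.
    time.sleep(0.5)
    return "late"


stop = threading.Event()
stop.set()  # simulate a user message arriving mid-call
interrupted = interruptible_call(slow_call, stop)
completed = interruptible_call(lambda: "ok", threading.Event())
```

Note that this pattern stops waiting rather than cancelling the underlying request; real cancellation depends on what the HTTP client supports.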
|
||||
|
||||
## Tool execution modes
|
||||
|
||||
Hermes uses two execution strategies:
|
||||
|
||||
- sequential execution for single or interactive tools
|
||||
- concurrent execution for multiple non-interactive tools
|
||||
|
||||
Concurrent tool execution preserves message/result ordering when reinserting tool responses into conversation history.
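The ordering guarantee can be illustrated with the standard library: `Executor.map` returns results in input order even when later calls finish first. `run_tools_concurrently` and `demo_tool` are hypothetical names for this sketch, and the message shape is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_tools_concurrently(tool_calls, run_tool, max_workers=4):
    """Execute calls in parallel; results come back in the order the calls were issued."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, regardless of completion order.
        outputs = list(pool.map(lambda c: run_tool(c["name"], c["arguments"]), tool_calls))
    return [
        {"role": "tool", "tool_call_id": call["id"], "content": out}
        for call, out in zip(tool_calls, outputs)
    ]


def demo_tool(name, arguments):
    # Sleep to force the first call to finish after the second one.
    time.sleep(arguments["delay"])
    return name


calls = [
    {"id": "a", "name": "slow", "arguments": {"delay": 0.05}},
    {"id": "b", "name": "fast", "arguments": {"delay": 0.0}},
]
results = run_tools_concurrently(calls, demo_tool)
```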
|
||||
|
||||
## Callback surfaces
|
||||
|
||||
`AIAgent` supports platform/integration callbacks such as:
|
||||
|
||||
- `tool_progress_callback`
|
||||
- `thinking_callback`
|
||||
- `reasoning_callback`
|
||||
- `clarify_callback`
|
||||
- `step_callback`
|
||||
- `stream_delta_callback`
|
||||
- `tool_gen_callback`
|
||||
- `status_callback`
|
||||
|
||||
These are how the CLI, gateway, and ACP integrations stream intermediate progress and interactive approval/clarification flows.
|
||||
|
||||
## Budget and fallback behavior
|
||||
|
||||
Hermes tracks a single iteration budget shared by the parent agent and its subagents. It also injects budget pressure hints near the end of the available iteration window.
|
||||
|
||||
Fallback model support allows the agent to switch providers/models when the primary route fails in supported failure paths.
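To make the shared budget concrete, here is a small sketch of a counter that several agents could draw from. `IterationBudget`, the 80% warning threshold, and the hint wording are all assumptions for illustration, not values taken from Hermes.

```python
import threading


class IterationBudget:
    """Thread-safe counter shared by a parent agent and its subagents (hypothetical)."""

    def __init__(self, total, warn_at=0.8):
        self.total = total
        self.used = 0
        self.warn_at = warn_at
        self._lock = threading.Lock()

    def consume(self):
        """Spend one iteration; return False once the budget is exhausted."""
        with self._lock:
            if self.used >= self.total:
                return False
            self.used += 1
            return True

    def pressure_hint(self):
        """Return a short hint near the end of the window, else None."""
        with self._lock:
            remaining = self.total - self.used
            if self.used >= self.total * self.warn_at:
                return f"{remaining} iterations left; wrap up."
            return None


budget = IterationBudget(total=5)
spent = [budget.consume() for _ in range(4)]  # parent and subagents draw from one pool
hint = budget.pressure_hint()
```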
|
||||
|
||||
## Compression and persistence
|
||||
|
||||
Before and during long runs, Hermes may:
|
||||
|
||||
- flush memory before context loss
|
||||
- compress middle conversation turns
|
||||
- split the session lineage into a new session ID after compression
|
||||
- preserve recent context and structural tool-call/result consistency
|
||||
|
||||
## Key files to read next
|
||||
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `agent/context_compressor.py`
|
||||
- `agent/prompt_caching.py`
|
||||
- `model_tools.py`
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Provider Runtime Resolution](./provider-runtime.md)
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
- [Tools Runtime](./tools-runtime.md)
|
||||
152
hermes_code/website/docs/developer-guide/architecture.md
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Architecture"
|
||||
description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
|
||||
---
|
||||
|
||||
# Architecture
|
||||
|
||||
This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
|
||||
|
||||
## High-level structure
|
||||
|
||||
```text
|
||||
hermes-agent/
|
||||
├── run_agent.py # AIAgent core loop
|
||||
├── cli.py # interactive terminal UI
|
||||
├── model_tools.py # tool discovery/orchestration
|
||||
├── toolsets.py # tool groupings and presets
|
||||
├── hermes_state.py # SQLite session/state database
|
||||
├── batch_runner.py # batch trajectory generation
|
||||
│
|
||||
├── agent/ # prompt building, compression, caching, metadata, trajectories
|
||||
├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor
|
||||
├── tools/ # tool implementations and terminal environments
|
||||
├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks
|
||||
├── cron/ # scheduled job storage and scheduler
|
||||
├── honcho_integration/ # Honcho memory integration
|
||||
├── acp_adapter/ # ACP editor integration server
|
||||
├── acp_registry/ # ACP registry manifest + icon
|
||||
├── environments/ # Hermes RL / benchmark environment framework
|
||||
├── skills/ # bundled skills
|
||||
├── optional-skills/ # official optional skills
|
||||
└── tests/ # test suite
|
||||
```
|
||||
|
||||
## Recommended reading order
|
||||
|
||||
If you are new to the codebase, read in this order:
|
||||
|
||||
1. this page
|
||||
2. [Agent Loop Internals](./agent-loop.md)
|
||||
3. [Prompt Assembly](./prompt-assembly.md)
|
||||
4. [Provider Runtime Resolution](./provider-runtime.md)
|
||||
5. [Adding Providers](./adding-providers.md)
|
||||
6. [Tools Runtime](./tools-runtime.md)
|
||||
7. [Session Storage](./session-storage.md)
|
||||
8. [Gateway Internals](./gateway-internals.md)
|
||||
9. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
10. [ACP Internals](./acp-internals.md)
|
||||
11. [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
|
||||
## Major subsystems
|
||||
|
||||
### Agent loop
|
||||
|
||||
The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
|
||||
|
||||
It is responsible for:
|
||||
|
||||
- provider/API-mode selection
|
||||
- prompt construction
|
||||
- tool execution
|
||||
- retries and fallback
|
||||
- callbacks
|
||||
- compression and persistence
|
||||
|
||||
See [Agent Loop Internals](./agent-loop.md).
|
||||
|
||||
### Prompt system
|
||||
|
||||
Prompt-building logic is split between:
|
||||
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `agent/prompt_caching.py`
|
||||
- `agent/context_compressor.py`
|
||||
|
||||
See:
|
||||
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
|
||||
### Provider/runtime resolution
|
||||
|
||||
Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
|
||||
|
||||
See [Provider Runtime Resolution](./provider-runtime.md).
|
||||
|
||||
### Tooling runtime
|
||||
|
||||
The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
|
||||
|
||||
See [Tools Runtime](./tools-runtime.md).
|
||||
|
||||
### Session persistence
|
||||
|
||||
Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
|
||||
|
||||
See [Session Storage](./session-storage.md).
|
||||
|
||||
### Messaging gateway
|
||||
|
||||
The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
|
||||
|
||||
See [Gateway Internals](./gateway-internals.md).
|
||||
|
||||
### ACP integration
|
||||
|
||||
ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
|
||||
|
||||
See:
|
||||
|
||||
- [ACP Editor Integration](../user-guide/features/acp.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
|
||||
### Cron
|
||||
|
||||
Cron jobs are implemented as first-class agent tasks, not just shell tasks.
|
||||
|
||||
See [Cron Internals](./cron-internals.md).
|
||||
|
||||
### RL / environments / trajectories
|
||||
|
||||
Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
|
||||
|
||||
See:
|
||||
|
||||
- [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
- [Trajectories & Training Format](./trajectory-format.md)
|
||||
|
||||
## Design themes
|
||||
|
||||
Several cross-cutting design themes appear throughout the codebase:
|
||||
|
||||
- prompt stability matters
|
||||
- tool execution must be observable and interruptible
|
||||
- session persistence must survive long-running use
|
||||
- platform frontends should share one agent core
|
||||
- optional subsystems should remain loosely coupled where possible
|
||||
|
||||
## Implementation notes
|
||||
|
||||
The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
|
||||
|
||||
- multiple API modes
|
||||
- auxiliary model routing
|
||||
- ACP editor integration
|
||||
- gateway-specific session and delivery semantics
|
||||
- RL environment infrastructure
|
||||
- prompt-caching and compression logic with lineage-aware persistence
|
||||
|
||||
Use this page as the map, then dive into subsystem-specific docs for the real implementation details.
|
||||
|
|
@ -0,0 +1,72 @@
|
|||
---
|
||||
sidebar_position: 6
|
||||
title: "Context Compression & Prompt Caching"
|
||||
description: "How Hermes compresses long conversations and applies provider-side prompt caching"
|
||||
---
|
||||
|
||||
# Context Compression & Prompt Caching
|
||||
|
||||
Hermes manages long conversations with two complementary mechanisms:
|
||||
|
||||
- prompt caching
|
||||
- context compression
|
||||
|
||||
Primary files:
|
||||
|
||||
- `agent/prompt_caching.py`
|
||||
- `agent/context_compressor.py`
|
||||
- `run_agent.py`
|
||||
|
||||
## Prompt caching
|
||||
|
||||
For Anthropic/native and Claude-via-OpenRouter flows, Hermes applies Anthropic-style cache markers.
|
||||
|
||||
Current strategy:
|
||||
|
||||
- cache the system prompt
|
||||
- cache the last 3 non-system messages
|
||||
- default TTL is 5 minutes unless explicitly extended
|
||||
|
||||
This is implemented in `agent/prompt_caching.py`.
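The strategy can be sketched as a pure function over a message list. `apply_cache_markers` is a hypothetical helper written for this page, not the code in `agent/prompt_caching.py`; it assumes OpenAI-style message dicts and the Anthropic-style `cache_control` marker of type `ephemeral`.

```python
import copy


def apply_cache_markers(messages, tail=3):
    """Return a copy with cache_control on the system prompt and the last `tail` other messages."""
    marked = copy.deepcopy(messages)
    system = [i for i, m in enumerate(marked) if m["role"] == "system"][:1]
    non_system = [i for i, m in enumerate(marked) if m["role"] != "system"]
    for i in system + non_system[-tail:]:
        content = marked[i]["content"]
        if isinstance(content, str):
            # Cache markers attach to content blocks, so wrap plain strings first.
            content = [{"type": "text", "text": content}]
        content[-1]["cache_control"] = {"type": "ephemeral"}
        marked[i]["content"] = content
    return marked


conversation = [{"role": "system", "content": "You are Hermes."}] + [
    {"role": "user" if n % 2 == 0 else "assistant", "content": f"turn {n}"} for n in range(5)
]
cached = apply_cache_markers(conversation)
marked_indexes = [i for i, m in enumerate(cached) if isinstance(m["content"], list)]
```

One system marker plus three message markers gives four markers in total, which fits within the four cache breakpoints Anthropic allows per request.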
|
||||
|
||||
## Why prompt stability matters
|
||||
|
||||
Prompt caching only helps when the stable prefix remains stable. That is why Hermes avoids rebuilding or mutating the core system prompt mid-session unless it has to.
|
||||
|
||||
## Compression trigger
|
||||
|
||||
Hermes can compress context when conversations become large. Configuration defaults live in `config.yaml`, and the compressor also has runtime checks based on actual prompt token counts.
|
||||
|
||||
## Compression algorithm
|
||||
|
||||
The compressor protects:
|
||||
|
||||
- the first N turns
|
||||
- the last N turns
|
||||
|
||||
and summarizes the middle section.
|
||||
|
||||
It also cleans up structural issues such as orphaned tool-call/result pairs so the API never receives invalid conversation structure after compression.
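A minimal sketch of the protect-head, protect-tail, summarize-middle idea follows. `compress` and its `summarize` callback are hypothetical; the real compressor in `agent/context_compressor.py` does more, and this version only repairs one kind of orphan (tool results whose originating tool call was summarized away).

```python
def compress(messages, summarize, keep_first=2, keep_last=2):
    """Keep the head and tail, replace the middle with one summary message."""
    if len(messages) <= keep_first + keep_last:
        return list(messages)
    head = messages[:keep_first]
    middle = messages[keep_first:len(messages) - keep_last]
    tail = messages[len(messages) - keep_last:]
    summary = {"role": "user", "content": "[summary] " + summarize(middle)}
    kept = head + [summary] + tail
    # Drop tool results whose originating assistant tool call is no longer present.
    known_ids = {c["id"] for m in kept for c in (m.get("tool_calls") or [])}
    return [m for m in kept if m.get("role") != "tool" or m.get("tool_call_id") in known_ids]


history = [
    {"role": "user", "content": "start"},
    {"role": "assistant", "content": "ok"},
    {"role": "user", "content": "check weather"},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "c1"}]},
    {"role": "tool", "tool_call_id": "c1", "content": "sunny"},
    {"role": "assistant", "content": "It is sunny."},
]
compact = compress(history, lambda msgs: f"{len(msgs)} messages omitted")
```

In this example the tool result for `c1` lands in the protected tail while its tool call falls in the summarized middle, so the orphaned result is removed.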
|
||||
|
||||
## Pre-compression memory flush
|
||||
|
||||
Before compression, Hermes can give the model one last chance to persist memory so facts are not lost when middle turns are summarized away.
|
||||
|
||||
## Session lineage after compression
|
||||
|
||||
Compression can split the session into a new session ID while preserving parent lineage in the state DB.
|
||||
|
||||
This lets Hermes continue operating with a smaller active context while retaining a searchable ancestry chain.
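The lineage idea can be shown with a parent pointer per session. The table and column names below are invented for this sketch and are not the actual state DB schema.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
# Hypothetical schema: each session optionally points at the session it was split from.
db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, parent_id TEXT)")


def split_session(conn, parent_id):
    """Create a child session that records which session it was compressed from."""
    child_id = uuid.uuid4().hex
    conn.execute("INSERT INTO sessions (id, parent_id) VALUES (?, ?)", (child_id, parent_id))
    return child_id


def ancestry(conn, session_id):
    """Walk parent links from a session back to the root."""
    chain = []
    while session_id is not None:
        chain.append(session_id)
        row = conn.execute("SELECT parent_id FROM sessions WHERE id = ?", (session_id,)).fetchone()
        session_id = row[0] if row else None
    return chain


db.execute("INSERT INTO sessions (id, parent_id) VALUES ('root', NULL)")
first = split_session(db, "root")
second = split_session(db, first)
lineage = ancestry(db, second)
```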
|
||||
|
||||
## Re-injected state after compression
|
||||
|
||||
After compression, Hermes may re-inject compact operational state such as:
|
||||
|
||||
- todo snapshot
|
||||
- prior-read-files summary
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Session Storage](./session-storage.md)
|
||||
- [Agent Loop Internals](./agent-loop.md)
|
||||
232
hermes_code/website/docs/developer-guide/contributing.md
Normal file
|
|
@ -0,0 +1,232 @@
|
|||
---
|
||||
sidebar_position: 4
|
||||
title: "Contributing"
|
||||
description: "How to contribute to Hermes Agent — dev setup, code style, PR process"
|
||||
---
|
||||
|
||||
# Contributing
|
||||
|
||||
Thank you for contributing to Hermes Agent! This guide covers setting up your dev environment, understanding the codebase, and getting your PR merged.
|
||||
|
||||
## Contribution Priorities
|
||||
|
||||
We value contributions in this order:
|
||||
|
||||
1. **Bug fixes** — crashes, incorrect behavior, data loss
|
||||
2. **Cross-platform compatibility** — macOS, different Linux distros, WSL2
|
||||
3. **Security hardening** — shell injection, prompt injection, path traversal
|
||||
4. **Performance and robustness** — retry logic, error handling, graceful degradation
|
||||
5. **New skills** — broadly useful ones (see [Creating Skills](creating-skills.md))
|
||||
6. **New tools** — rarely needed; most capabilities should be skills
|
||||
7. **Documentation** — fixes, clarifications, new examples
|
||||
|
||||
## Common contribution paths
|
||||
|
||||
- Building a new tool? Start with [Adding Tools](./adding-tools.md)
|
||||
- Building a new skill? Start with [Creating Skills](./creating-skills.md)
|
||||
- Building a new inference provider? Start with [Adding Providers](./adding-providers.md)
|
||||
|
||||
## Development Setup
|
||||
|
||||
### Prerequisites
|
||||
|
||||
| Requirement | Notes |
|
||||
|-------------|-------|
|
||||
| **Git** | With `--recurse-submodules` support |
|
||||
| **Python 3.10+** | uv will install it if missing |
|
||||
| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
|
||||
| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge |
|
||||
|
||||
### Clone and Install
|
||||
|
||||
```bash
|
||||
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
|
||||
cd hermes-agent
|
||||
|
||||
# Create venv with Python 3.11
|
||||
uv venv venv --python 3.11
|
||||
export VIRTUAL_ENV="$(pwd)/venv"
|
||||
|
||||
# Install with all extras (messaging, cron, CLI menus, dev tools)
|
||||
uv pip install -e ".[all,dev]"
|
||||
uv pip install -e "./tinker-atropos"
|
||||
|
||||
# Optional: browser tools
|
||||
npm install
|
||||
```
|
||||
|
||||
### Configure for Development
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills}
|
||||
cp cli-config.yaml.example ~/.hermes/config.yaml
|
||||
touch ~/.hermes/.env
|
||||
|
||||
# Add at minimum an LLM provider key:
|
||||
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
|
||||
```
|
||||
|
||||
### Run
|
||||
|
||||
```bash
|
||||
# Symlink for global access
|
||||
mkdir -p ~/.local/bin
|
||||
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
|
||||
|
||||
# Verify
|
||||
hermes doctor
|
||||
hermes chat -q "Hello"
|
||||
```
|
||||
|
||||
### Run Tests
|
||||
|
||||
```bash
|
||||
pytest tests/ -v
|
||||
```
|
||||
|
||||
## Code Style
|
||||
|
||||
- **PEP 8** with practical exceptions (no strict line length enforcement)
|
||||
- **Comments**: Only when explaining non-obvious intent, trade-offs, or API quirks
|
||||
- **Error handling**: Catch specific exceptions. Use `logger.warning()`/`logger.error()` with `exc_info=True` for unexpected errors
|
||||
- **Cross-platform**: Never assume Unix (see below)
|
||||
|
||||
## Cross-Platform Compatibility
|
||||
|
||||
Hermes officially supports Linux, macOS, and WSL2. Native Windows is **not supported**, but the codebase includes some defensive coding patterns to avoid hard crashes in edge cases. Key rules:
|
||||
|
||||
### 1. `termios` and `fcntl` are Unix-only
|
||||
|
||||
Always catch both `ImportError` and `NotImplementedError`:
|
||||
|
||||
```python
|
||||
try:
    from simple_term_menu import TerminalMenu
    menu = TerminalMenu(options)
    idx = menu.show()
except (ImportError, NotImplementedError):
    # Fallback: numbered menu
    for i, opt in enumerate(options):
        print(f" {i+1}. {opt}")
    idx = int(input("Choice: ")) - 1
|
||||
```
|
||||
|
||||
### 2. File encoding
|
||||
|
||||
Some environments may save `.env` files in non-UTF-8 encodings:
|
||||
|
||||
```python
|
||||
try:
    load_dotenv(env_path)
except UnicodeDecodeError:
    load_dotenv(env_path, encoding="latin-1")
|
||||
```
|
||||
|
||||
### 3. Process management
|
||||
|
||||
`os.setsid()`, `os.killpg()`, and signal handling differ across platforms:
|
||||
|
||||
```python
|
||||
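import os
import platform

kwargs = {}  # extra keyword arguments for subprocess.Popen
if platform.system() != "Windows":
    # Start the child in its own session so os.killpg() can reach the whole group
    kwargs["preexec_fn"] = os.setsid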
|
||||
```
|
||||
|
||||
### 4. Path separators
|
||||
|
||||
Use `pathlib.Path` instead of string concatenation with `/`.
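For example, paths under the Hermes home directory can be built with the `/` operator on `Path` objects, which picks the right separator per platform. The variable names are illustrative; the directory names mirror the setup steps earlier on this page.

```python
from pathlib import Path

# Compose paths from Path objects instead of concatenating strings.
hermes_home = Path.home() / ".hermes"
env_file = hermes_home / ".env"
sessions_dir = hermes_home / "sessions"
```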
|
||||
|
||||
## Security Considerations
|
||||
|
||||
Hermes has terminal access. Security matters.
|
||||
|
||||
### Existing Protections
|
||||
|
||||
| Layer | Implementation |
|
||||
|-------|---------------|
|
||||
| **Sudo password piping** | Uses `shlex.quote()` to prevent shell injection |
|
||||
| **Dangerous command detection** | Regex patterns in `tools/approval.py` with user approval flow |
|
||||
| **Cron prompt injection** | Scanner blocks instruction-override patterns |
|
||||
| **Write deny list** | Protected paths resolved via `os.path.realpath()` to prevent symlink bypass |
|
||||
| **Skills guard** | Security scanner for hub-installed skills |
|
||||
| **Code execution sandbox** | Child process runs with API keys stripped |
|
||||
| **Container hardening** | Docker: all capabilities dropped, no privilege escalation, PID limits |
|
||||
|
||||
### Contributing Security-Sensitive Code
|
||||
|
||||
- Always use `shlex.quote()` when interpolating user input into shell commands
|
||||
- Resolve symlinks with `os.path.realpath()` before access control checks
|
||||
- Don't log secrets
|
||||
- Catch broad exceptions around tool execution
|
||||
- Test on all platforms if your change touches file paths or processes
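The first two rules can be sketched with the standard library. `build_grep_command` and `is_write_allowed` are hypothetical helpers for illustration; they are not the functions Hermes uses for its approval flow or write deny list.

```python
import os
import shlex


def build_grep_command(pattern, path):
    """Quote untrusted values before they are interpolated into a shell command."""
    return f"grep -- {shlex.quote(pattern)} {shlex.quote(path)}"


def is_write_allowed(target, denied_dirs):
    """Resolve symlinks first so a link into a protected directory is still denied."""
    real = os.path.realpath(target)
    for denied in denied_dirs:
        denied_real = os.path.realpath(denied)
        if real == denied_real or real.startswith(denied_real + os.sep):
            return False
    return True


command = build_grep_command("foo; rm -rf /", "notes.txt")
```

Quoting turns the injected `; rm -rf /` into part of a single literal argument, and resolving the path before the check means a symlink cannot be used to sidestep the deny list.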
|
||||
|
||||
## Pull Request Process
|
||||
|
||||
### Branch Naming
|
||||
|
||||
```
|
||||
fix/description # Bug fixes
|
||||
feat/description # New features
|
||||
docs/description # Documentation
|
||||
test/description # Tests
|
||||
refactor/description # Code restructuring
|
||||
```
|
||||
|
||||
### Before Submitting
|
||||
|
||||
1. **Run tests**: `pytest tests/ -v`
|
||||
2. **Test manually**: Run `hermes` and exercise the code path you changed
|
||||
3. **Check cross-platform impact**: Consider macOS, different Linux distros, and WSL2
|
||||
4. **Keep PRs focused**: One logical change per PR
|
||||
|
||||
### PR Description
|
||||
|
||||
Include:
|
||||
- **What** changed and **why**
|
||||
- **How to test** it
|
||||
- **What platforms** you tested on
|
||||
- Reference any related issues
|
||||
|
||||
### Commit Messages
|
||||
|
||||
We use [Conventional Commits](https://www.conventionalcommits.org/):
|
||||
|
||||
```
|
||||
<type>(<scope>): <description>
|
||||
```
|
||||
|
||||
| Type | Use for |
|
||||
|------|---------|
|
||||
| `fix` | Bug fixes |
|
||||
| `feat` | New features |
|
||||
| `docs` | Documentation |
|
||||
| `test` | Tests |
|
||||
| `refactor` | Code restructuring |
|
||||
| `chore` | Build, CI, dependency updates |
|
||||
|
||||
Scopes: `cli`, `gateway`, `tools`, `skills`, `agent`, `install`, `whatsapp`, `security`
|
||||
|
||||
Examples:
|
||||
```
|
||||
fix(cli): prevent crash in save_config_value when model is a string
|
||||
feat(gateway): add WhatsApp multi-user session isolation
|
||||
fix(security): prevent shell injection in sudo password piping
|
||||
```
|
||||
|
||||
## Reporting Issues
|
||||
|
||||
- Use [GitHub Issues](https://github.com/NousResearch/hermes-agent/issues)
|
||||
- Include: OS, Python version, Hermes version (`hermes version`), full error traceback
|
||||
- Include steps to reproduce
|
||||
- Check existing issues before creating duplicates
|
||||
- For security vulnerabilities, please report privately
|
||||
|
||||
## Community
|
||||
|
||||
- **Discord**: [discord.gg/NousResearch](https://discord.gg/NousResearch)
|
||||
- **GitHub Discussions**: For design proposals and architecture discussions
|
||||
- **Skills Hub**: Upload specialized skills and share with the community
|
||||
|
||||
## License
|
||||
|
||||
By contributing, you agree that your contributions will be licensed under the [MIT License](https://github.com/NousResearch/hermes-agent/blob/main/LICENSE).
|
||||
247
hermes_code/website/docs/developer-guide/creating-skills.md
Normal file
|
|
@ -0,0 +1,247 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Creating Skills"
|
||||
description: "How to create skills for Hermes Agent — SKILL.md format, guidelines, and publishing"
|
||||
---
|
||||
|
||||
# Creating Skills
|
||||
|
||||
Skills are the preferred way to add new capabilities to Hermes Agent. They're easier to create than tools, require no code changes to the agent, and can be shared with the community.
|
||||
|
||||
## Should it be a Skill or a Tool?
|
||||
|
||||
Make it a **Skill** when:
|
||||
- The capability can be expressed as instructions + shell commands + existing tools
|
||||
- It wraps an external CLI or API that the agent can call via `terminal` or `web_extract`
|
||||
- It doesn't need custom Python integration or API key management baked into the agent
|
||||
- Examples: arXiv search, git workflows, Docker management, PDF processing, email via CLI tools
|
||||
|
||||
Make it a **Tool** when:
|
||||
- It requires end-to-end integration with API keys, auth flows, or multi-component configuration
|
||||
- It needs custom processing logic that must execute precisely every time
|
||||
- It handles binary data, streaming, or real-time events
|
||||
- Examples: browser automation, TTS, vision analysis
|
||||
|
||||
## Skill Directory Structure
|
||||
|
||||
Bundled skills live in `skills/` organized by category. Official optional skills use the same structure in `optional-skills/`:
|
||||
|
||||
```text
|
||||
skills/
├── research/
│   └── arxiv/
│       ├── SKILL.md              # Required: main instructions
│       └── scripts/              # Optional: helper scripts
│           └── search_arxiv.py
├── productivity/
│   └── ocr-and-documents/
│       ├── SKILL.md
│       ├── scripts/
│       └── references/
└── ...
|
||||
```
|
||||
|
||||
## SKILL.md Format
|
||||
|
||||
```markdown
|
||||
---
name: my-skill
description: Brief description (shown in skill search results)
version: 1.0.0
author: Your Name
license: MIT
platforms: [macos, linux]          # Optional: restrict to specific OS platforms
                                   #   Valid: macos, linux, windows
                                   #   Omit to load on all platforms (default)
metadata:
  hermes:
    tags: [Category, Subcategory, Keywords]
    related_skills: [other-skill-name]
    requires_toolsets: [web]                 # Optional: only show when these toolsets are active
    requires_tools: [web_search]             # Optional: only show when these tools are available
    fallback_for_toolsets: [browser]         # Optional: hide when these toolsets are active
    fallback_for_tools: [browser_navigate]   # Optional: hide when these tools exist
required_environment_variables:              # Optional: env vars the skill needs
  - name: MY_API_KEY
    prompt: "Enter your API key"
    help: "Get one at https://example.com"
    required_for: "API access"
---
|
||||
|
||||
# Skill Title
|
||||
|
||||
Brief intro.
|
||||
|
||||
## When to Use
|
||||
Trigger conditions — when should the agent load this skill?
|
||||
|
||||
## Quick Reference
|
||||
Table of common commands or API calls.
|
||||
|
||||
## Procedure
|
||||
Step-by-step instructions the agent follows.
|
||||
|
||||
## Pitfalls
|
||||
Known failure modes and how to handle them.
|
||||
|
||||
## Verification
|
||||
How the agent confirms it worked.
|
||||
```
|
||||
|
||||
### Platform-Specific Skills
|
||||
|
||||
Skills can restrict themselves to specific operating systems using the `platforms` field:
|
||||
|
||||
```yaml
|
||||
platforms: [macos] # macOS only (e.g., iMessage, Apple Reminders)
|
||||
platforms: [macos, linux] # macOS and Linux
|
||||
platforms: [windows] # Windows only
|
||||
```
|
||||
|
||||
When set, the skill is automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms. If omitted or empty, the skill loads on all platforms (backward compatible).
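The platform check itself can be pictured as below. The mapping from `sys.platform` values to the `macos`, `linux`, and `windows` names is an assumption for this sketch, and `skill_matches_platform` is a hypothetical function name.

```python
import sys

# Assumed mapping from sys.platform prefixes to the names used in SKILL.md.
_PLATFORM_NAMES = {"darwin": "macos", "linux": "linux", "win32": "windows"}


def skill_matches_platform(platforms, current=None):
    """Return True when the skill should load on the current platform."""
    if not platforms:
        return True  # omitted or empty means all platforms
    current = current or sys.platform
    name = next((v for k, v in _PLATFORM_NAMES.items() if current.startswith(k)), None)
    return name in platforms
```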
|
||||
|
||||
### Conditional Skill Activation
|
||||
|
||||
Skills can declare dependencies on specific tools or toolsets. This controls whether the skill appears in the system prompt for a given session.
|
||||
|
||||
```yaml
|
||||
metadata:
  hermes:
    requires_toolsets: [web]                 # Hide if the web toolset is NOT active
    requires_tools: [web_search]             # Hide if web_search tool is NOT available
    fallback_for_toolsets: [browser]         # Hide if the browser toolset IS active
    fallback_for_tools: [browser_navigate]   # Hide if browser_navigate IS available
|
||||
```
|
||||
|
||||
| Field | Behavior |
|
||||
|-------|----------|
|
||||
| `requires_toolsets` | Skill is **hidden** when ANY listed toolset is **not** available |
|
||||
| `requires_tools` | Skill is **hidden** when ANY listed tool is **not** available |
|
||||
| `fallback_for_toolsets` | Skill is **hidden** when ANY listed toolset **is** available |
|
||||
| `fallback_for_tools` | Skill is **hidden** when ANY listed tool **is** available |
|
||||
|
||||
**Use case for `fallback_for_*`:** Create a skill that serves as a workaround when a primary tool isn't available. For example, a `duckduckgo-search` skill with `fallback_for_tools: [web_search]` only shows when the web search tool (which requires an API key) is not configured.
|
||||
|
||||
**Use case for `requires_*`:** Create a skill that only makes sense when certain tools are present. For example, a web scraping workflow skill with `requires_toolsets: [web]` won't clutter the prompt when web tools are disabled.
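The four rules in the table above translate directly into a visibility predicate. `skill_is_visible` is a hypothetical function written for this page; it assumes the `metadata.hermes` block has already been parsed into a dict and that active toolsets and available tools are given as sets.

```python
def skill_is_visible(meta, active_toolsets, available_tools):
    """Apply the four visibility rules to one skill's metadata."""
    if any(t not in active_toolsets for t in meta.get("requires_toolsets", [])):
        return False
    if any(t not in available_tools for t in meta.get("requires_tools", [])):
        return False
    if any(t in active_toolsets for t in meta.get("fallback_for_toolsets", [])):
        return False
    if any(t in available_tools for t in meta.get("fallback_for_tools", [])):
        return False
    return True


# The two use cases described above.
ddg = {"fallback_for_tools": ["web_search"]}
scraper = {"requires_toolsets": ["web"]}
```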
|
||||
|
||||
### Environment Variable Requirements
|
||||
|
||||
Skills can declare environment variables they need. When a skill is loaded via `skill_view`, its required vars are automatically registered for passthrough into sandboxed execution environments (terminal, execute_code).
|
||||
|
||||
```yaml
|
||||
required_environment_variables:
  - name: TENOR_API_KEY
    prompt: "Tenor API key"                       # Shown when prompting user
    help: "Get your key at https://tenor.com"     # Help text or URL
    required_for: "GIF search functionality"      # What needs this var
|
||||
```
|
||||
|
||||
Each entry supports:
|
||||
- `name` (required) — the environment variable name
|
||||
- `prompt` (optional) — prompt text when asking the user for the value
|
||||
- `help` (optional) — help text or URL for obtaining the value
|
||||
- `required_for` (optional) — describes which feature needs this variable
|
||||
|
||||
Users can also manually configure passthrough variables in `config.yaml`:
|
||||
|
||||
```yaml
|
||||
terminal:
  env_passthrough:
    - MY_CUSTOM_VAR
    - ANOTHER_VAR
|
||||
```
|
||||
|
||||
See `skills/apple/` for examples of macOS-only skills.
|
||||
|
||||
## Secure Setup on Load
|
||||
|
||||
Use `required_environment_variables` when a skill needs an API key or token. Missing values do **not** hide the skill from discovery. Instead, Hermes prompts for them securely when the skill is loaded in the local CLI.
|
||||
|
||||
```yaml
|
||||
required_environment_variables:
  - name: TENOR_API_KEY
    prompt: Tenor API key
    help: Get a key from https://developers.google.com/tenor
    required_for: full functionality
|
||||
```
|
||||
|
||||
The user can skip setup and keep loading the skill. Hermes never exposes the raw secret value to the model. Gateway and messaging sessions show local setup guidance instead of collecting secrets in-band.
|
||||
|
||||
:::tip Sandbox Passthrough
|
||||
When your skill is loaded, any declared `required_environment_variables` that are set are **automatically passed through** to `execute_code` and `terminal` sandboxes. Your skill's scripts can access `$TENOR_API_KEY` (or `os.environ["TENOR_API_KEY"]` in Python) without the user needing to configure anything extra. See [Environment Variable Passthrough](/docs/user-guide/security#environment-variable-passthrough) for details.
|
||||
:::
|
||||
|
||||
Legacy `prerequisites.env_vars` remains supported as a backward-compatible alias.
|
||||
|
||||
## Skill Guidelines
|
||||
|
||||
### No External Dependencies
|
||||
|
||||
Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`). If a dependency is needed, document installation steps in the skill.
|
||||
|
||||
### Progressive Disclosure
|
||||
|
||||
Put the most common workflow first. Edge cases and advanced usage go at the bottom. This keeps token usage low for common tasks.
|
||||
|
||||
### Include Helper Scripts
|
||||
|
||||
For XML/JSON parsing or complex logic, include helper scripts in `scripts/` — don't expect the LLM to write parsers inline every time.
|
||||
|
||||
### Test It
|
||||
|
||||
Run the skill and verify the agent follows the instructions correctly:
|
||||
|
||||
```bash
|
||||
hermes chat --toolsets skills -q "Use the X skill to do Y"
|
||||
```
|
||||
|
||||
## Where Should the Skill Live?
|
||||
|
||||
Bundled skills (in `skills/`) ship with every Hermes install. They should be **broadly useful to most users**:
|
||||
|
||||
- Document handling, web research, common dev workflows, system administration
|
||||
- Used regularly by a wide range of people
|
||||
|
||||
If your skill is official and useful but not universally needed (e.g., a paid service integration, a heavyweight dependency), put it in **`optional-skills/`** — it ships with the repo, is discoverable via `hermes skills browse` (labeled "official"), and installs with builtin trust.
|
||||
|
||||
If your skill is specialized, community-contributed, or niche, it's better suited for a **Skills Hub** — upload it to a registry and share it via `hermes skills install`.
|
||||
|
||||
## Publishing Skills
|
||||
|
||||
### To the Skills Hub
|
||||
|
||||
```bash
|
||||
hermes skills publish skills/my-skill --to github --repo owner/repo
|
||||
```
|
||||
|
||||
### To a Custom Repository
|
||||
|
||||
Add your repo as a tap:
|
||||
|
||||
```bash
|
||||
hermes skills tap add owner/repo
|
||||
```
|
||||
|
||||
Users can then search and install from your repository.
|
||||
|
||||
## Security Scanning
|
||||
|
||||
All hub-installed skills go through a security scanner that checks for:
|
||||
|
||||
- Data exfiltration patterns
|
||||
- Prompt injection attempts
|
||||
- Destructive commands
|
||||
- Shell injection
|
||||
|
||||
Trust levels:
|
||||
- `builtin` — ships with Hermes (always trusted)
|
||||
- `official` — from `optional-skills/` in the repo (builtin trust, no third-party warning)
|
||||
- `trusted` — from openai/skills, anthropics/skills
|
||||
- `community` — non-dangerous findings can be overridden with `--force`; `dangerous` verdicts remain blocked
|
||||
|
||||
Hermes can consume third-party skills through several external discovery mechanisms:
|
||||
- direct GitHub identifiers (for example `openai/skills/k8s`)
|
||||
- `skills.sh` identifiers (for example `skills-sh/vercel-labs/json-render/json-render-react`)
|
||||
- well-known endpoints served from `/.well-known/skills/index.json`
|
||||
|
||||
If you want your skills to be discoverable without a GitHub-specific installer, consider serving them from a well-known endpoint in addition to publishing them in a repo or marketplace.
|
||||
90
hermes_code/website/docs/developer-guide/cron-internals.md
Normal file
|
|
@ -0,0 +1,90 @@
|
|||
---
|
||||
sidebar_position: 11
|
||||
title: "Cron Internals"
|
||||
description: "How Hermes stores, schedules, edits, pauses, skill-loads, and delivers cron jobs"
|
||||
---
|
||||
|
||||
# Cron Internals
|
||||
|
||||
Hermes cron support is implemented primarily in:
|
||||
|
||||
- `cron/jobs.py`
|
||||
- `cron/scheduler.py`
|
||||
- `tools/cronjob_tools.py`
|
||||
- `gateway/run.py`
|
||||
- `hermes_cli/cron.py`
|
||||
|
||||
## Scheduling model
|
||||
|
||||
Hermes supports:
|
||||
|
||||
- one-shot delays
|
||||
- intervals
|
||||
- cron expressions
|
||||
- explicit timestamps
|
||||
|
||||
The model-facing surface is a single `cronjob` tool with action-style operations:
|
||||
|
||||
- `create`
|
||||
- `list`
|
||||
- `update`
|
||||
- `pause`
|
||||
- `resume`
|
||||
- `run`
|
||||
- `remove`
|
||||
|
||||
## Job storage
|
||||
|
||||
Cron jobs are stored in Hermes-managed local state (`~/.hermes/cron/jobs.json`) with atomic write semantics.
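
The atomic-write behaviour mentioned above is commonly achieved by writing to a temporary file in the same directory and then renaming it over the target. The sketch below illustrates that general pattern only; `save_jobs_atomically` is a hypothetical helper name, not the function used in `cron/jobs.py`.

```python
import json
import os
import tempfile
from pathlib import Path


def save_jobs_atomically(path: Path, jobs: list[dict]) -> None:
    """Write jobs as JSON so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(jobs, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash before `os.replace` leaves the previous `jobs.json` untouched, which is the property "atomic write semantics" usually refers to.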
|
||||
|
||||
Each job can carry:
|
||||
|
||||
- prompt
|
||||
- schedule metadata
|
||||
- repeat counters
|
||||
- delivery target
|
||||
- lifecycle state (`scheduled`, `paused`, `completed`, etc.)
|
||||
- zero, one, or multiple attached skills
|
||||
|
||||
Backward compatibility is preserved for older jobs that only stored a legacy single `skill` field or none of the newer lifecycle fields.
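
One way to picture this compatibility layer is a normalisation step applied when a job is loaded. The sketch below is an assumption for illustration: the plural `skills` key, the `state` key, and the `normalize_job` helper are hypothetical names, and the real loader may differ.

```python
def normalize_job(job: dict) -> dict:
    """Return a copy of a stored job with newer fields filled in."""
    normalized = dict(job)
    # Hypothetical field names: legacy `skill` (str) vs newer `skills` (list).
    if "skills" not in normalized:
        legacy = normalized.pop("skill", None)
        normalized["skills"] = [legacy] if legacy else []
    else:
        normalized.pop("skill", None)
    # Older records may predate lifecycle tracking; assume they are still active.
    normalized.setdefault("state", "scheduled")
    return normalized
```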
|
||||
|
||||
## Runtime behavior
|
||||
|
||||
The scheduler:
|
||||
|
||||
- loads jobs
|
||||
- computes due work
|
||||
- executes jobs in fresh agent sessions
|
||||
- optionally injects one or more skills before the prompt
|
||||
- handles repeat counters
|
||||
- updates next-run metadata and state
|
||||
|
||||
In gateway mode, cron ticking is integrated into the long-running gateway loop.
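
The scheduler steps listed above (computing due work, handling repeat counters, updating next-run metadata) can be sketched for the simple interval case as follows. The field names `next_run_at`, `repeats_left`, and `interval_seconds` are illustrative assumptions, not the schema used by `cron/scheduler.py`.

```python
from datetime import datetime, timedelta


def due_jobs(jobs: list[dict], now: datetime) -> list[dict]:
    """Select scheduled jobs whose next run time has arrived."""
    return [
        job for job in jobs
        if job.get("state") == "scheduled" and job["next_run_at"] <= now
    ]


def advance_interval_job(job: dict, now: datetime) -> dict:
    """Update repeat counters and next-run metadata after one execution."""
    updated = dict(job)
    remaining = updated.get("repeats_left")  # None means repeat forever
    if remaining is not None:
        remaining -= 1
        updated["repeats_left"] = remaining
    if remaining == 0:
        updated["state"] = "completed"
    else:
        updated["next_run_at"] = now + timedelta(seconds=updated["interval_seconds"])
    return updated
```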
|
||||
|
||||
## Skill-backed jobs
|
||||
|
||||
A cron job may attach multiple skills. At runtime, Hermes loads those skills in order and then appends the job prompt as the task instruction.
|
||||
|
||||
This gives scheduled jobs reusable guidance without requiring the user to paste full skill bodies into the cron prompt.
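
As a rough illustration of that ordering, the sketch below concatenates skill bodies first and the job prompt last. `build_cron_prompt` and the `Task:` label are hypothetical; the actual message layout Hermes uses is not shown in this document.

```python
def build_cron_prompt(skill_bodies: list[str], job_prompt: str) -> str:
    """Concatenate skill guidance in order, then the job's own instruction."""
    sections = [body.strip() for body in skill_bodies if body.strip()]
    sections.append(f"Task:\n{job_prompt.strip()}")
    return "\n\n".join(sections)
```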
|
||||
|
||||
## Recursion guard
|
||||
|
||||
Cron-run sessions disable the `cronjob` toolset. This prevents a scheduled job from recursively creating or mutating more cron jobs and accidentally exploding token usage or scheduler load.
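
Conceptually, the guard is a filter applied to the toolsets of a cron-run session. The helper below is a minimal sketch under that assumption; `toolsets_for_cron_session` and `CRON_BLOCKED_TOOLSETS` are hypothetical names.

```python
CRON_BLOCKED_TOOLSETS = {"cronjob"}


def toolsets_for_cron_session(enabled: list[str]) -> list[str]:
    """Drop toolsets a scheduled run must not use, preserving order."""
    return [name for name in enabled if name not in CRON_BLOCKED_TOOLSETS]
```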
|
||||
|
||||
## Delivery model
|
||||
|
||||
Cron jobs can deliver to:
|
||||
|
||||
- origin chat
|
||||
- local files
|
||||
- platform home channels
|
||||
- explicit platform/chat IDs
|
||||
|
||||
## Locking
|
||||
|
||||
Hermes uses lock-based protections so overlapping scheduler ticks do not execute the same due-job batch twice.
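
The idea can be sketched with a non-blocking lock: a tick that cannot take the lock skips its work instead of queueing behind the running tick. This in-process version is an assumption for illustration; Hermes may use a different mechanism (for example a file lock shared across processes), and `run_tick` is a hypothetical name.

```python
import threading

_tick_lock = threading.Lock()


def run_tick(execute_due_jobs) -> bool:
    """Run one scheduler tick unless another tick is already in progress."""
    if not _tick_lock.acquire(blocking=False):
        return False  # an overlapping tick holds the lock; skip this one
    try:
        execute_due_jobs()
        return True
    finally:
        _tick_lock.release()
```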
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Cron feature guide](../user-guide/features/cron.md)
|
||||
- [Gateway Internals](./gateway-internals.md)
|
||||
520
hermes_code/website/docs/developer-guide/environments.md
Normal file
|
|
@ -0,0 +1,520 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Environments, Benchmarks & Data Generation"
|
||||
description: "Building RL training environments, running evaluation benchmarks, and generating SFT data with the Hermes-Agent Atropos integration"
|
||||
---
|
||||
|
||||
# Environments, Benchmarks & Data Generation
|
||||
|
||||
Hermes Agent includes a full environment framework that connects its tool-calling capabilities to the [Atropos](https://github.com/NousResearch/atropos) RL training framework. This enables three workflows:
|
||||
|
||||
1. **RL Training** — Train language models on multi-turn agentic tasks with GRPO
|
||||
2. **Benchmarks** — Evaluate models on standardised agentic benchmarks
|
||||
3. **Data Generation** — Generate SFT training data from agent rollouts
|
||||
|
||||
All three share the same core: an **environment** class that defines tasks, runs an agent loop, and scores the output.
|
||||
|
||||
:::info Repo environments vs RL training tools
|
||||
The Python environment framework documented here lives under the repo's `environments/` directory and is the implementation-level API for Hermes/Atropos integration. This is separate from the user-facing `rl_*` tools, which operate as an orchestration surface for remote RL training workflows.
|
||||
:::
|
||||
|
||||
:::tip Quick Links
|
||||
- **Want to run benchmarks?** Jump to [Available Benchmarks](#available-benchmarks)
|
||||
- **Want to train with RL?** See [RL Training Tools](/user-guide/features/rl-training) for the agent-driven interface, or [Running Environments](#running-environments) for manual execution
|
||||
- **Want to create a new environment?** See [Creating Environments](#creating-environments)
|
||||
:::
|
||||
|
||||
## Architecture
|
||||
|
||||
The environment system is built on a three-layer inheritance chain:
|
||||
|
||||
```mermaid
|
||||
classDiagram
|
||||
class BaseEnv {
|
||||
Server management
|
||||
Worker scheduling
|
||||
Wandb logging
|
||||
CLI: serve / process / evaluate
|
||||
}
|
||||
|
||||
class HermesAgentBaseEnv {
|
||||
Terminal backend configuration
|
||||
Tool resolution
|
||||
Agent loop engine
|
||||
ToolContext access
|
||||
}
|
||||
|
||||
class TerminalTestEnv {
|
||||
Stack testing
|
||||
}
|
||||
|
||||
class HermesSweEnv {
|
||||
SWE training
|
||||
}
|
||||
|
||||
class TerminalBench2EvalEnv {
|
||||
Benchmark evaluation
|
||||
}
|
||||
|
||||
class TBLiteEvalEnv {
|
||||
Fast benchmark
|
||||
}
|
||||
|
||||
class YCBenchEvalEnv {
|
||||
Long-horizon benchmark
|
||||
}
|
||||
|
||||
BaseEnv <|-- HermesAgentBaseEnv
|
||||
HermesAgentBaseEnv <|-- TerminalTestEnv
|
||||
HermesAgentBaseEnv <|-- HermesSweEnv
|
||||
HermesAgentBaseEnv <|-- TerminalBench2EvalEnv
|
||||
TerminalBench2EvalEnv <|-- TBLiteEvalEnv
|
||||
TerminalBench2EvalEnv <|-- YCBenchEvalEnv
|
||||
```
|
||||
|
||||
### BaseEnv (Atropos)
|
||||
|
||||
The foundation from `atroposlib`. Provides:
|
||||
- **Server management** — connects to OpenAI-compatible APIs (VLLM, SGLang, OpenRouter)
|
||||
- **Worker scheduling** — parallel rollout coordination
|
||||
- **Wandb integration** — metrics logging and rollout visualisation
|
||||
- **CLI interface** — three subcommands: `serve`, `process`, `evaluate`
|
||||
- **Eval logging** — `evaluate_log()` saves results to JSON + JSONL
|
||||
|
||||
### HermesAgentBaseEnv
|
||||
|
||||
The hermes-agent layer (`environments/hermes_base_env.py`). Adds:
|
||||
- **Terminal backend configuration** — sets `TERMINAL_ENV` for sandboxed execution (local, Docker, Modal, Daytona, SSH, Singularity)
|
||||
- **Tool resolution** — `_resolve_tools_for_group()` calls hermes-agent's `get_tool_definitions()` to get the right tool schemas based on enabled/disabled toolsets
|
||||
- **Agent loop integration** — `collect_trajectory()` runs `HermesAgentLoop` and scores the result
|
||||
- **Two-phase operation** — Phase 1 (OpenAI server) for eval/SFT, Phase 2 (VLLM ManagedServer) for full RL with logprobs
|
||||
- **Async safety patches** — monkey-patches Modal backend to work inside Atropos's event loop
|
||||
|
||||
### Concrete Environments
|
||||
|
||||
Your environment inherits from `HermesAgentBaseEnv` and implements five methods:
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `setup()` | Load dataset, initialise state |
| `get_next_item()` | Return the next item for rollout |
| `format_prompt(item)` | Convert an item into the user message |
| `compute_reward(item, result, ctx)` | Score the rollout (0.0–1.0) |
| `evaluate()` | Periodic evaluation logic |
|
||||
|
||||
## Core Components
|
||||
|
||||
### Agent Loop
|
||||
|
||||
`HermesAgentLoop` (`environments/agent_loop.py`) is the reusable multi-turn agent engine. It runs the same tool-calling pattern as hermes-agent's main loop:
|
||||
|
||||
1. Send messages + tool schemas to the API via `server.chat_completion()`
|
||||
2. If the response contains `tool_calls`, dispatch each via `handle_function_call()`
|
||||
3. Append tool results to the conversation, go back to step 1
|
||||
4. If no `tool_calls`, the agent is done
|
||||
|
||||
Tool calls execute in a thread pool (`ThreadPoolExecutor(128)`) so that async backends (Modal, Docker) don't deadlock inside Atropos's event loop.
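
The four steps above can be sketched as a plain synchronous loop. This is a simplified illustration, not the code in `environments/agent_loop.py`: the real loop is async and dispatches tools through a thread pool, and the `chat` and `dispatch` callables here are hypothetical stand-ins for `server.chat_completion()` and `handle_function_call()`.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_agent_loop(
    chat: Callable[[List[Message]], Message],
    dispatch: Callable[[str, Dict[str, Any]], str],
    messages: List[Message],
    max_turns: int = 30,
) -> Dict[str, Any]:
    """Alternate between model calls and tool dispatch until the model stops."""
    turns_used = 0
    for _ in range(max_turns):
        reply = chat(messages)  # step 1: send the conversation to the model
        turns_used += 1
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:  # step 4: no tool calls means the agent is done
            return {"messages": messages, "turns_used": turns_used,
                    "finished_naturally": True}
        for call in tool_calls:  # step 2: dispatch each requested tool
            args = json.loads(call["function"]["arguments"])
            output = dispatch(call["function"]["name"], args)
            # step 3: append the result so the next turn can see it
            messages.append({"role": "tool", "tool_call_id": call["id"],
                             "content": output})
    # The turn budget ran out while the model was still calling tools.
    return {"messages": messages, "turns_used": turns_used,
            "finished_naturally": False}
```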
|
||||
|
||||
Returns an `AgentResult`:
|
||||
|
||||
```python
|
||||
@dataclass
class AgentResult:
    messages: List[Dict[str, Any]]           # Full conversation history
    turns_used: int                          # Number of LLM calls made
    finished_naturally: bool                 # True if model stopped on its own
    reasoning_per_turn: List[Optional[str]]  # Extracted reasoning content
    tool_errors: List[ToolError]             # Errors encountered during tool dispatch
    managed_state: Optional[Dict]            # VLLM ManagedServer state (Phase 2)
|
||||
```
|
||||
|
||||
### Tool Context
|
||||
|
||||
`ToolContext` (`environments/tool_context.py`) gives reward functions direct access to the **same sandbox** the model used during its rollout. The `task_id` scoping means all state (files, processes, browser tabs) is preserved.
|
||||
|
||||
```python
|
||||
async def compute_reward(self, item, result, ctx: ToolContext):
    # Run tests in the model's terminal sandbox
    test = ctx.terminal("pytest -v")
    if test["exit_code"] == 0:
        return 1.0

    # Check if a file was created
    content = ctx.read_file("/workspace/solution.py")
    if content.get("content"):
        return 0.5

    # Download files for local verification
    ctx.download_file("/remote/output.bin", "/local/output.bin")
    return 0.0
|
||||
```
|
||||
|
||||
Available methods:
|
||||
|
||||
| Category | Methods |
|----------|---------|
| **Terminal** | `terminal(command, timeout)` |
| **Files** | `read_file(path)`, `write_file(path, content)`, `search(query, path)` |
| **Transfers** | `upload_file()`, `upload_dir()`, `download_file()`, `download_dir()` |
| **Web** | `web_search(query)`, `web_extract(urls)` |
| **Browser** | `browser_navigate(url)`, `browser_snapshot()` |
| **Generic** | `call_tool(name, args)`: escape hatch for any hermes-agent tool |
| **Cleanup** | `cleanup()`: release all resources |
|
||||
|
||||
### Tool Call Parsers
|
||||
|
||||
For **Phase 2** (VLLM ManagedServer), the server returns raw text without structured tool calls. Client-side parsers in `environments/tool_call_parsers/` extract `tool_calls` from raw output:
|
||||
|
||||
```python
|
||||
from environments.tool_call_parsers import get_parser
|
||||
|
||||
parser = get_parser("hermes") # or "mistral", "llama3_json", "qwen", "deepseek_v3", etc.
|
||||
content, tool_calls = parser.parse(raw_model_output)
|
||||
```
|
||||
|
||||
Available parsers: `hermes`, `mistral`, `llama3_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1`, `kimi_k2`, `longcat`, `glm45`, `glm47`.
|
||||
|
||||
In Phase 1 (OpenAI server type), parsers are not needed — the server handles tool call parsing natively.
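
To make the Phase 2 parsing step concrete, here is a minimal sketch of a parser for the Hermes/ChatML `<tool_call>` format, assuming each block wraps a JSON object with `name` and `arguments` keys. It is not the implementation in `environments/tool_call_parsers/hermes_parser.py`; the `parse_tool_calls` function and the generated `call_N` IDs are illustrative.

```python
import json
import re
from typing import Any, Dict, List, Tuple

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(raw: str) -> Tuple[str, List[Dict[str, Any]]]:
    """Split raw model text into plain content and structured tool calls."""
    calls: List[Dict[str, Any]] = []
    for index, match in enumerate(TOOL_CALL_RE.finditer(raw)):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # ignore malformed blocks rather than failing the rollout
        if not isinstance(payload, dict) or "name" not in payload:
            continue
        calls.append({
            "id": f"call_{index}",
            "type": "function",
            "function": {
                "name": payload["name"],
                # OpenAI-style tool calls carry arguments as a JSON string.
                "arguments": json.dumps(payload.get("arguments", {})),
            },
        })
    content = TOOL_CALL_RE.sub("", raw).strip()
    return content, calls
```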
|
||||
|
||||
## Available Benchmarks
|
||||
|
||||
### TerminalBench2
|
||||
|
||||
**89 challenging terminal tasks** with per-task Docker sandbox environments.
|
||||
|
||||
| Aspect | Details |
|---|---|
| **What it tests** | Single-task coding/sysadmin ability |
| **Scoring** | Binary pass/fail (test suite verification) |
| **Sandbox** | Modal cloud sandboxes (per-task Docker images) |
| **Tools** | `terminal` + `file` |
| **Tasks** | 89 tasks across multiple categories |
| **Cost** | ~$50–200 for full eval (parallel execution) |
| **Time** | ~2–4 hours |
|
||||
|
||||
```bash
|
||||
python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
|
||||
--config environments/benchmarks/terminalbench_2/default.yaml
|
||||
|
||||
# Run specific tasks
|
||||
python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
|
||||
--config environments/benchmarks/terminalbench_2/default.yaml \
|
||||
--env.task_filter fix-git,git-multibranch
|
||||
```
|
||||
|
||||
Dataset: [NousResearch/terminal-bench-2](https://huggingface.co/datasets/NousResearch/terminal-bench-2) on HuggingFace.
|
||||
|
||||
### TBLite (OpenThoughts Terminal Bench Lite)
|
||||
|
||||
**100 difficulty-calibrated tasks** — a faster proxy for TerminalBench2.
|
||||
|
||||
| Aspect | Details |
|---|---|
| **What it tests** | Same as TB2 (coding/sysadmin), calibrated difficulty tiers |
| **Scoring** | Binary pass/fail |
| **Sandbox** | Modal cloud sandboxes |
| **Tools** | `terminal` + `file` |
| **Tasks** | 100 tasks: Easy (40), Medium (26), Hard (26), Extreme (8) |
| **Correlation** | r=0.911 with full TB2 |
| **Speed** | 2.6–8× faster than TB2 |
|
||||
|
||||
```bash
|
||||
python environments/benchmarks/tblite/tblite_env.py evaluate \
|
||||
--config environments/benchmarks/tblite/default.yaml
|
||||
```
|
||||
|
||||
TBLite is a thin subclass of TerminalBench2; only the dataset and timeouts differ. The benchmark was created by the OpenThoughts Agent team (Snorkel AI + Bespoke Labs). Dataset: [NousResearch/openthoughts-tblite](https://huggingface.co/datasets/NousResearch/openthoughts-tblite).
|
||||
|
||||
### YC-Bench
|
||||
|
||||
**Long-horizon strategic benchmark** — the agent plays CEO of an AI startup.
|
||||
|
||||
| Aspect | Details |
|---|---|
| **What it tests** | Multi-turn strategic coherence over hundreds of turns |
| **Scoring** | Composite: `0.5 × survival + 0.5 × normalised_funds` |
| **Sandbox** | Local terminal (no Modal needed) |
| **Tools** | `terminal` only |
| **Runs** | 9 default (3 presets × 3 seeds), sequential |
| **Cost** | ~$50–200 for full eval |
| **Time** | ~3–6 hours |
|
||||
|
||||
```bash
|
||||
# Install yc-bench (optional dependency)
|
||||
pip install "hermes-agent[yc-bench]"
|
||||
|
||||
# Run evaluation
|
||||
bash environments/benchmarks/yc_bench/run_eval.sh
|
||||
|
||||
# Or directly
|
||||
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
|
||||
--config environments/benchmarks/yc_bench/default.yaml
|
||||
|
||||
# Quick single-preset test
|
||||
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
|
||||
--config environments/benchmarks/yc_bench/default.yaml \
|
||||
--env.presets '["fast_test"]' --env.seeds '[1]'
|
||||
```
|
||||
|
||||
YC-Bench uses [collinear-ai/yc-bench](https://github.com/collinear-ai/yc-bench) — a deterministic simulation with 4 skill domains (research, inference, data_environment, training), prestige system, employee management, and financial pressure. Unlike TB2's per-task binary scoring, YC-Bench measures whether an agent can maintain coherent strategy over hundreds of compounding decisions.
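
The composite score from the table above, `0.5 × survival + 0.5 × normalised_funds`, can be written out as a small function. This is a sketch: it assumes `survival` is a 0/1 flag and that `normalised_funds` is meant to lie in the range 0 to 1 (the clamp is my assumption), and `yc_bench_score` is a hypothetical name.

```python
def yc_bench_score(survived: bool, normalised_funds: float) -> float:
    """Composite score: 0.5 * survival + 0.5 * normalised_funds."""
    # Assumption: normalised funds are clamped to [0, 1] before weighting.
    funds = min(max(normalised_funds, 0.0), 1.0)
    return 0.5 * float(survived) + 0.5 * funds
```

Under these assumptions, a run that goes bankrupt scores at most 0.5, however much money it held earlier.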
|
||||
|
||||
## Training Environments
|
||||
|
||||
### TerminalTestEnv
|
||||
|
||||
A minimal self-contained environment with inline tasks (no external dataset). Used for **validating the full stack** end-to-end. Each task asks the model to create a file at a known path; the verifier checks the content.
|
||||
|
||||
```bash
|
||||
# Process mode (saves rollouts to JSONL, no training server needed)
|
||||
python environments/terminal_test_env/terminal_test_env.py process \
|
||||
--env.data_path_to_save_groups terminal_test_output.jsonl
|
||||
|
||||
# Serve mode (connects to Atropos API for RL training)
|
||||
python environments/terminal_test_env/terminal_test_env.py serve
|
||||
```
|
||||
|
||||
### HermesSweEnv
|
||||
|
||||
SWE-bench style training environment. The model gets a coding task, uses terminal + file + web tools to solve it, and the reward function runs tests in the same Modal sandbox.
|
||||
|
||||
```bash
|
||||
python environments/hermes_swe_env/hermes_swe_env.py serve \
|
||||
--openai.model_name YourModel \
|
||||
--env.dataset_name bigcode/humanevalpack \
|
||||
--env.terminal_backend modal
|
||||
```
|
||||
|
||||
## Running Environments
|
||||
|
||||
Every environment is a standalone Python script with three CLI subcommands:
|
||||
|
||||
### `evaluate` — Run a benchmark
|
||||
|
||||
Use this for eval-only environments (benchmarks). It runs all items, computes metrics, and logs to wandb.
|
||||
|
||||
```bash
|
||||
python environments/benchmarks/tblite/tblite_env.py evaluate \
|
||||
--config environments/benchmarks/tblite/default.yaml \
|
||||
--openai.model_name anthropic/claude-sonnet-4.6
|
||||
```
|
||||
|
||||
No training server or `run-api` needed. The environment handles everything.
|
||||
|
||||
### `process` — Generate SFT data
|
||||
|
||||
Runs rollouts and saves scored trajectories to JSONL. Useful for generating training data without a full RL loop.
|
||||
|
||||
```bash
|
||||
python environments/terminal_test_env/terminal_test_env.py process \
|
||||
--env.data_path_to_save_groups output.jsonl \
|
||||
--openai.model_name anthropic/claude-sonnet-4.6
|
||||
```
|
||||
|
||||
Output format: each line is a scored trajectory with the full conversation history, reward, and metadata.
|
||||
|
||||
### `serve` — Connect to Atropos for RL training
|
||||
|
||||
Connects the environment to a running Atropos API server (`run-api`). Used during live RL training.
|
||||
|
||||
```bash
|
||||
# Terminal 1: Start the Atropos API
|
||||
run-api
|
||||
|
||||
# Terminal 2: Start the environment
|
||||
python environments/hermes_swe_env/hermes_swe_env.py serve \
|
||||
--openai.model_name YourModel
|
||||
```
|
||||
|
||||
The environment receives items from Atropos, runs agent rollouts, computes rewards, and sends scored trajectories back for training.
|
||||
|
||||
## Two-Phase Operation
|
||||
|
||||
### Phase 1: OpenAI Server (Eval / SFT)
|
||||
|
||||
Uses `server.chat_completion()` with `tools=` parameter. The server (VLLM, SGLang, OpenRouter, OpenAI) handles tool call parsing natively. Returns `ChatCompletion` objects with structured `tool_calls`.
|
||||
|
||||
- **Use for**: evaluation, SFT data generation, benchmarks, testing
|
||||
- **Placeholder tokens** are created for the Atropos pipeline (since real token IDs aren't available from the OpenAI API)
|
||||
|
||||
### Phase 2: VLLM ManagedServer (Full RL)
|
||||
|
||||
Uses ManagedServer for exact token IDs + logprobs via `/generate`. A client-side [tool call parser](#tool-call-parsers) reconstructs structured `tool_calls` from raw output.
|
||||
|
||||
- **Use for**: full RL training with GRPO/PPO
|
||||
- **Real tokens**, masks, and logprobs flow through the pipeline
|
||||
- Set `tool_call_parser` in config to match your model's format (e.g., `"hermes"`, `"qwen"`, `"mistral"`)
|
||||
|
||||
## Creating Environments
|
||||
|
||||
### Training Environment
|
||||
|
||||
```python
|
||||
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
from atroposlib.envs.server_handling.server_manager import APIServerConfig


class MyEnvConfig(HermesAgentEnvConfig):
    my_custom_field: str = "default_value"


class MyEnv(HermesAgentBaseEnv):
    name = "my-env"
    env_config_cls = MyEnvConfig

    @classmethod
    def config_init(cls):
        env_config = MyEnvConfig(
            enabled_toolsets=["terminal", "file"],
            terminal_backend="modal",
            max_agent_turns=30,
        )
        server_configs = [APIServerConfig(
            base_url="https://openrouter.ai/api/v1",
            model_name="anthropic/claude-sonnet-4.6",
            server_type="openai",
        )]
        return env_config, server_configs

    async def setup(self):
        from datasets import load_dataset
        self.dataset = list(load_dataset("my-dataset", split="train"))
        self.iter = 0

    async def get_next_item(self):
        item = self.dataset[self.iter % len(self.dataset)]
        self.iter += 1
        return item

    def format_prompt(self, item):
        return item["instruction"]

    async def compute_reward(self, item, result, ctx):
        # ctx gives full tool access to the rollout's sandbox
        test = ctx.terminal("pytest -v")
        return 1.0 if test["exit_code"] == 0 else 0.0

    async def evaluate(self, *args, **kwargs):
        # Periodic evaluation during training
        pass


if __name__ == "__main__":
    MyEnv.cli()
|
||||
```
|
||||
|
||||
### Eval-Only Benchmark
|
||||
|
||||
For benchmarks, follow the pattern used by TerminalBench2, TBLite, and YC-Bench:
|
||||
|
||||
1. **Create under** `environments/benchmarks/your-benchmark/`
|
||||
2. **Set eval-only config**: `eval_handling=STOP_TRAIN`, `steps_per_eval=1`, `total_steps=1`
|
||||
3. **Stub training methods**: `collect_trajectories()` returns `(None, [])`, `score()` returns `None`
|
||||
4. **Implement** `rollout_and_score_eval(eval_item)` — the per-item agent loop + scoring
|
||||
5. **Implement** `evaluate()` — orchestrates all runs, computes aggregate metrics
|
||||
6. **Add streaming JSONL** for crash-safe result persistence
|
||||
7. **Add cleanup**: `KeyboardInterrupt` handling, `cleanup_all_environments()`, `_tool_executor.shutdown()`
|
||||
8. **Run with** `evaluate` subcommand
|
||||
|
||||
See `environments/benchmarks/yc_bench/yc_bench_env.py` for a clean, well-documented reference implementation.
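
Step 6 (streaming JSONL for crash-safe persistence) can be sketched as appending one flushed line per finished item and reading the file back on restart. The helper names and the `task` key below are assumptions for illustration, not taken from the reference implementation.

```python
import json
from pathlib import Path
from typing import Any, Dict


def append_result(path: Path, record: Dict[str, Any]) -> None:
    """Append one result as a JSON line and flush it to disk immediately."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
        handle.flush()


def load_completed(path: Path) -> set:
    """Return task names already recorded, so a restarted run can skip them."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as handle:
        return {json.loads(line)["task"] for line in handle if line.strip()}
```

Because each record is written as soon as its item finishes, an interrupted run loses at most the item that was in flight.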
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
### HermesAgentEnvConfig Fields
|
||||
|
||||
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enabled_toolsets` | `List[str]` | `None` (all) | Which hermes toolsets to enable |
| `disabled_toolsets` | `List[str]` | `None` | Toolsets to filter out |
| `distribution` | `str` | `None` | Probabilistic toolset distribution name |
| `max_agent_turns` | `int` | `30` | Max LLM calls per rollout |
| `agent_temperature` | `float` | `1.0` | Sampling temperature |
| `system_prompt` | `str` | `None` | System message for the agent |
| `terminal_backend` | `str` | `"local"` | `local`, `docker`, `modal`, `daytona`, `ssh`, `singularity` |
| `terminal_timeout` | `int` | `120` | Seconds per terminal command |
| `terminal_lifetime` | `int` | `3600` | Max sandbox lifetime |
| `dataset_name` | `str` | `None` | HuggingFace dataset identifier |
| `tool_pool_size` | `int` | `128` | Thread pool size for tool execution |
| `tool_call_parser` | `str` | `"hermes"` | Parser for Phase 2 raw output |
| `extra_body` | `Dict` | `None` | Extra params for OpenAI API (e.g., OpenRouter provider prefs) |
| `eval_handling` | `Enum` | `STOP_TRAIN` | `STOP_TRAIN`, `LIMIT_TRAIN`, `NONE` |
|
||||
|
||||
### YAML Configuration
|
||||
|
||||
Environments can be configured via YAML files passed with `--config`:
|
||||
|
||||
```yaml
|
||||
env:
  enabled_toolsets: ["terminal", "file"]
  max_agent_turns: 60
  max_token_length: 32000
  agent_temperature: 0.8
  terminal_backend: "modal"
  terminal_timeout: 300
  dataset_name: "NousResearch/terminal-bench-2"
  tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
  use_wandb: true
  wandb_name: "my-benchmark"

openai:
  base_url: "https://openrouter.ai/api/v1"
  model_name: "anthropic/claude-sonnet-4.6"
  server_type: "openai"
  health_check: false
|
||||
```
|
||||
|
||||
YAML values override `config_init()` defaults. CLI arguments override YAML values:
|
||||
|
||||
```bash
|
||||
python my_env.py evaluate \
|
||||
--config my_config.yaml \
|
||||
--openai.model_name anthropic/claude-opus-4.6 # overrides YAML
|
||||
```
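
The precedence described above (defaults from `config_init()`, then YAML, then CLI) amounts to a layered merge in which later layers win. The sketch below shows that rule on flat dictionaries; it is not how Atropos implements its config loading, and `resolve_config` is a hypothetical helper.

```python
from typing import Any, Dict


def resolve_config(
    defaults: Dict[str, Any],
    yaml_values: Dict[str, Any],
    cli_values: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge three layers; later layers win, and unset (None) CLI flags are ignored."""
    merged = dict(defaults)
    merged.update(yaml_values)
    merged.update({k: v for k, v in cli_values.items() if v is not None})
    return merged
```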
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### For all environments
|
||||
|
||||
- Python >= 3.11
|
||||
- `atroposlib`: `pip install git+https://github.com/NousResearch/atropos.git`
|
||||
- An LLM API key (OpenRouter, OpenAI, or self-hosted VLLM/SGLang)
|
||||
|
||||
### For Modal-sandboxed benchmarks (TB2, TBLite)
|
||||
|
||||
- [Modal](https://modal.com) account and CLI: `pip install "hermes-agent[modal]"`
|
||||
- `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` environment variables
|
||||
|
||||
### For YC-Bench
|
||||
|
||||
- `pip install "hermes-agent[yc-bench]"` (installs the yc-bench CLI + SQLAlchemy)
|
||||
- No Modal needed — runs with local terminal backend
|
||||
|
||||
### For RL training
|
||||
|
||||
- `TINKER_API_KEY` — API key for the [Tinker](https://tinker.computer) training service
|
||||
- `WANDB_API_KEY` — for Weights & Biases metrics tracking
|
||||
- The `tinker-atropos` submodule (at `tinker-atropos/` in the repo)
|
||||
|
||||
See [RL Training](/user-guide/features/rl-training) for the agent-driven RL workflow.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
environments/
├── hermes_base_env.py          # Abstract base class (HermesAgentBaseEnv)
├── agent_loop.py               # Multi-turn agent engine (HermesAgentLoop)
├── tool_context.py             # Per-rollout tool access for reward functions
├── patches.py                  # Async-safety patches for Modal backend
│
├── tool_call_parsers/          # Phase 2 client-side parsers
│   ├── hermes_parser.py        # Hermes/ChatML <tool_call> format
│   ├── mistral_parser.py       # Mistral [TOOL_CALLS] format
│   ├── llama_parser.py         # Llama 3 JSON tool calling
│   ├── qwen_parser.py          # Qwen format
│   ├── deepseek_v3_parser.py   # DeepSeek V3 format
│   └── ...                     # + kimi_k2, longcat, glm45/47, etc.
│
├── terminal_test_env/          # Stack validation (inline tasks)
├── hermes_swe_env/             # SWE-bench training environment
│
└── benchmarks/                 # Evaluation benchmarks
    ├── terminalbench_2/        # 89 terminal tasks, Modal sandboxes
    ├── tblite/                 # 100 calibrated tasks (fast TB2 proxy)
    └── yc_bench/               # Long-horizon strategic benchmark
|
||||
```
|
||||
190
hermes_code/website/docs/developer-guide/extending-the-cli.md
Normal file
|
|
@ -0,0 +1,190 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Extending the CLI"
|
||||
description: "Build wrapper CLIs that extend the Hermes TUI with custom widgets, keybindings, and layout changes"
|
||||
---
|
||||
|
||||
# Extending the CLI
|
||||
|
||||
Hermes exposes protected extension hooks on `HermesCLI` so wrapper CLIs can add widgets, keybindings, and layout customizations without overriding the 1000+ line `run()` method. This keeps your extension decoupled from internal changes.
|
||||
|
||||
## Extension points
|
||||
|
||||
There are five extension seams available:
|
||||
|
||||
| Hook | Purpose | Override when... |
|------|---------|------------------|
| `_get_extra_tui_widgets()` | Inject widgets into the layout | You need a persistent UI element (panel, status line, mini-player) |
| `_register_extra_tui_keybindings(kb, *, input_area)` | Add keyboard shortcuts | You need hotkeys (toggle panels, transport controls, modal shortcuts) |
| `_build_tui_layout_children(**widgets)` | Full control over widget ordering | You need to reorder or wrap existing widgets (rare) |
| `process_command()` | Add custom slash commands | You need `/mycommand` handling (pre-existing hook) |
| `_build_tui_style_dict()` | Custom prompt_toolkit styles | You need custom colors or styling (pre-existing hook) |
|
||||
|
||||
The first three are new protected hooks. The last two already existed.
|
||||
|
||||
## Quick start: a wrapper CLI
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
"""my_cli.py: example wrapper CLI that extends Hermes."""

from cli import HermesCLI
from prompt_toolkit.filters import Condition
from prompt_toolkit.layout import ConditionalContainer, FormattedTextControl, Window


class MyCLI(HermesCLI):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._panel_visible = False

    def _get_extra_tui_widgets(self):
        """Add a toggleable info panel above the status bar."""
        cli_ref = self
        return [
            # Window takes no `filter` argument, so the visibility
            # condition goes on a ConditionalContainer wrapper.
            ConditionalContainer(
                Window(
                    FormattedTextControl(lambda: "📊 My custom panel content"),
                    height=1,
                ),
                filter=Condition(lambda: cli_ref._panel_visible),
            ),
        ]

    def _register_extra_tui_keybindings(self, kb, *, input_area):
        """F2 toggles the custom panel."""
        cli_ref = self

        @kb.add("f2")
        def _toggle_panel(event):
            cli_ref._panel_visible = not cli_ref._panel_visible

    def process_command(self, cmd: str) -> bool:
        """Add a /panel slash command."""
        if cmd.strip().lower() == "/panel":
            self._panel_visible = not self._panel_visible
            state = "visible" if self._panel_visible else "hidden"
            print(f"Panel is now {state}")
            return True
        return super().process_command(cmd)


if __name__ == "__main__":
    cli = MyCLI()
    cli.run()
|
||||
```
|
||||
|
||||
Run it:
|
||||
|
||||
```bash
|
||||
cd ~/.hermes/hermes-agent
|
||||
source .venv/bin/activate
|
||||
python my_cli.py
|
||||
```
|
||||
|
||||
## Hook reference
|
||||
|
||||
### `_get_extra_tui_widgets()`
|
||||
|
||||
Returns a list of prompt_toolkit widgets to insert into the TUI layout. Widgets appear **between the spacer and the status bar** — above the input area but below the main output.
|
||||
|
||||
```python
|
||||
def _get_extra_tui_widgets(self) -> list:
    return []  # default: no extra widgets
|
||||
```
|
||||
|
||||
Each widget should be a prompt_toolkit container (e.g., `Window`, `ConditionalContainer`, `HSplit`). To make a widget toggleable, wrap it in a `ConditionalContainer` and pass `filter=Condition(...)`.
|
||||
|
||||
```python
|
||||
from prompt_toolkit.layout import ConditionalContainer, Window, FormattedTextControl
from prompt_toolkit.filters import Condition

def _get_extra_tui_widgets(self):
    return [
        ConditionalContainer(
            Window(FormattedTextControl("Status: connected"), height=1),
            filter=Condition(lambda: self._show_status),
        ),
    ]
|
||||
```
|
||||
|
||||
### `_register_extra_tui_keybindings(kb, *, input_area)`
|
||||
|
||||
Called after Hermes registers its own keybindings and before the layout is built. Add your keybindings to `kb`.
|
||||
|
||||
```python
|
||||
def _register_extra_tui_keybindings(self, kb, *, input_area):
    pass  # default: no extra keybindings
|
||||
```
|
||||
|
||||
Parameters:
|
||||
- **`kb`** — The `KeyBindings` instance for the prompt_toolkit application
|
||||
- **`input_area`** — The main `TextArea` widget, if you need to read or manipulate user input
|
||||
|
||||
```python
|
||||
def _register_extra_tui_keybindings(self, kb, *, input_area):
    cli_ref = self

    @kb.add("f3")
    def _clear_input(event):
        input_area.text = ""

    @kb.add("f4")
    def _insert_template(event):
        input_area.text = "/search "
|
||||
```
|
||||
|
||||
**Avoid conflicts** with built-in keybindings: `Enter` (submit), `Escape Enter` (newline), `Ctrl-C` (interrupt), `Ctrl-D` (exit), `Tab` (auto-suggest accept). Function keys from F2 upward and Ctrl-combinations other than those listed here are generally safe.
|
||||
|
||||
### `_build_tui_layout_children(**widgets)`
|
||||
|
||||
Override this only when you need full control over widget ordering. Most extensions should use `_get_extra_tui_widgets()` instead.
|
||||
|
||||
```python
def _build_tui_layout_children(self, *, sudo_widget, secret_widget,
        approval_widget, clarify_widget, spinner_widget, spacer,
        status_bar, input_rule_top, image_bar, input_area,
        input_rule_bot, voice_status_bar, completions_menu) -> list:
    ...  # signature only; the default body is shown below
```
|
||||
|
||||
The default implementation returns:
|
||||
|
||||
```python
[
    Window(height=0),                # anchor
    sudo_widget,                     # sudo password prompt (conditional)
    secret_widget,                   # secret input prompt (conditional)
    approval_widget,                 # dangerous command approval (conditional)
    clarify_widget,                  # clarify question UI (conditional)
    spinner_widget,                  # thinking spinner (conditional)
    spacer,                          # fills remaining vertical space
    *self._get_extra_tui_widgets(),  # YOUR WIDGETS GO HERE
    status_bar,                      # model/token/context status line
    input_rule_top,                  # ─── border above input
    image_bar,                       # attached images indicator
    input_area,                      # user text input
    input_rule_bot,                  # ─── border below input
    voice_status_bar,                # voice mode status (conditional)
    completions_menu,                # autocomplete dropdown
]
```
|
||||
|
||||
## Layout diagram
|
||||
|
||||
The default layout from top to bottom:
|
||||
|
||||
1. **Output area** — scrolling conversation history
|
||||
2. **Spacer**
|
||||
3. **Extra widgets** — from `_get_extra_tui_widgets()`
|
||||
4. **Status bar** — model, context %, elapsed time
|
||||
5. **Image bar** — attached image count
|
||||
6. **Input area** — user prompt
|
||||
7. **Voice status** — recording indicator
|
||||
8. **Completions menu** — autocomplete suggestions
|
||||
|
||||
## Tips
|
||||
|
||||
- **Invalidate the display** after state changes: call `self._invalidate()` to trigger a prompt_toolkit redraw.
|
||||
- **Access agent state**: `self.agent`, `self.model`, `self.conversation_history` are all available.
|
||||
- **Custom styles**: Override `_build_tui_style_dict()` and add entries for your custom style classes.
|
||||
- **Slash commands**: Override `process_command()`, handle your commands, and call `super().process_command(cmd)` for everything else.
|
||||
- **Don't override `run()`** unless absolutely necessary — the extension hooks exist specifically to avoid that coupling.
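The slash-command tip above can be sketched as follows. This is only an illustration of the override-then-delegate pattern: `BaseCLI` is a hypothetical stand-in for the real Hermes CLI class, and the `/ping` command and the string return values are assumptions made for the example.

```python
class BaseCLI:
    """Hypothetical stand-in for the Hermes CLI base class."""

    def process_command(self, cmd: str) -> str:
        # Pretend the base class handles every built-in command.
        return f"builtin:{cmd}"


class MyCLI(BaseCLI):
    def process_command(self, cmd: str) -> str:
        # Handle the extension's own command first.
        if cmd.strip() == "/ping":
            return "pong"
        # Defer everything else to the parent implementation.
        return super().process_command(cmd)
```

Keeping the custom branch narrow and delegating by default means new built-in commands keep working without changes to the subclass.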
|
||||
121
hermes_code/website/docs/developer-guide/gateway-internals.md
Normal file
|
|
@ -0,0 +1,121 @@
|
|||
---
|
||||
sidebar_position: 7
|
||||
title: "Gateway Internals"
|
||||
description: "How the messaging gateway boots, authorizes users, routes sessions, and delivers messages"
|
||||
---
|
||||
|
||||
# Gateway Internals
|
||||
|
||||
The messaging gateway is the long-running process that connects Hermes to external platforms.
|
||||
|
||||
Key files:
|
||||
|
||||
- `gateway/run.py`
|
||||
- `gateway/config.py`
|
||||
- `gateway/session.py`
|
||||
- `gateway/delivery.py`
|
||||
- `gateway/pairing.py`
|
||||
- `gateway/channel_directory.py`
|
||||
- `gateway/hooks.py`
|
||||
- `gateway/mirror.py`
|
||||
- `gateway/platforms/*`
|
||||
|
||||
## Core responsibilities
|
||||
|
||||
The gateway process is responsible for:
|
||||
|
||||
- loading configuration from `.env`, `config.yaml`, and `gateway.json`
|
||||
- starting platform adapters
|
||||
- authorizing users
|
||||
- routing incoming events to sessions
|
||||
- maintaining per-chat session continuity
|
||||
- dispatching messages to `AIAgent`
|
||||
- running cron ticks and background maintenance tasks
|
||||
- mirroring/proactively delivering output to configured channels
|
||||
|
||||
## Config sources
|
||||
|
||||
The gateway has a multi-source config model:
|
||||
|
||||
- environment variables
|
||||
- `~/.hermes/gateway.json`
|
||||
- selected bridged values from `~/.hermes/config.yaml`
|
||||
|
||||
## Session routing
|
||||
|
||||
`gateway/session.py` and `GatewayRunner` cooperate to map incoming messages to active session IDs.
|
||||
|
||||
Session keying can depend on:
|
||||
|
||||
- platform
|
||||
- user/chat identity
|
||||
- thread/topic identity
|
||||
- special platform-specific routing behavior
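As a rough illustration of how the inputs above could combine into one key, here is a minimal sketch. The function name `build_session_key` and the colon-separated key format are assumptions for the example, not the actual scheme used in `gateway/session.py`.

```python
from typing import Optional


def build_session_key(platform: str, chat_id: str, thread_id: Optional[str] = None) -> str:
    """Hypothetical key builder: one session per platform/chat, split by thread when present."""
    parts = [platform, chat_id]
    if thread_id:
        # Threads or topics get their own session within the same chat.
        parts.append(thread_id)
    return ":".join(parts)
```

With a scheme like this, two threads in the same chat map to different sessions, while repeated messages in one thread always resolve to the same key.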
|
||||
|
||||
## Authorization layers
|
||||
|
||||
The gateway can authorize through:
|
||||
|
||||
- platform allowlists
|
||||
- gateway-wide allowlists
|
||||
- DM pairing flows
|
||||
- explicit allow-all settings
|
||||
|
||||
Pairing support is implemented in `gateway/pairing.py`.
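One way to picture the layers above is a single check that tries each of them in turn. This is a sketch under stated assumptions: `AuthConfig`, its field names, and the order of the checks are hypothetical and do not come from the gateway source.

```python
from dataclasses import dataclass, field


@dataclass
class AuthConfig:
    """Hypothetical container for the authorization layers listed above."""
    allow_all: bool = False
    gateway_allowlist: set = field(default_factory=set)      # user IDs allowed everywhere
    platform_allowlists: dict = field(default_factory=dict)  # platform -> set of user IDs
    paired_users: set = field(default_factory=set)           # (platform, user_id) tuples


def is_authorized(cfg: AuthConfig, platform: str, user_id: str) -> bool:
    if cfg.allow_all:
        return True
    if user_id in cfg.gateway_allowlist:
        return True
    if user_id in cfg.platform_allowlists.get(platform, set()):
        return True
    # Users who completed a DM pairing flow are accepted last.
    return (platform, user_id) in cfg.paired_users
```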
|
||||
|
||||
## Delivery path
|
||||
|
||||
Outgoing deliveries are handled by `gateway/delivery.py`, which knows how to:
|
||||
|
||||
- deliver to a home channel
|
||||
- resolve explicit targets
|
||||
- mirror some remote deliveries back into local history/session tracking
|
||||
|
||||
## Hooks
|
||||
|
||||
Gateway events emit hook callbacks through `gateway/hooks.py`. Hooks are local trusted Python code and can observe or extend gateway lifecycle events.
|
||||
|
||||
## Background maintenance
|
||||
|
||||
The gateway also runs maintenance tasks such as:
|
||||
|
||||
- cron ticking
|
||||
- cache refreshes
|
||||
- session expiry checks
|
||||
- proactive memory flush before reset/expiry
|
||||
|
||||
## Honcho interaction
|
||||
|
||||
When Honcho is enabled, the gateway keeps persistent Honcho managers aligned with session lifetimes and platform-specific session keys.
|
||||
|
||||
### Session routing
|
||||
|
||||
Honcho tools (`honcho_profile`, `honcho_search`, `honcho_context`, `honcho_conclude`) need to execute against the correct user's Honcho session. In a multi-user gateway, the process-global module state in `tools/honcho_tools.py` is insufficient — multiple sessions may be active concurrently.
|
||||
|
||||
The solution threads session context through the call chain:
|
||||
|
||||
```
AIAgent._invoke_tool()
  → handle_function_call(honcho_manager=..., honcho_session_key=...)
      → registry.dispatch(**kwargs)
          → _handle_honcho_*(args, **kw)
              → _resolve_session_context(**kw)  # prefers explicit kwargs over module globals
```
|
||||
|
||||
`_resolve_session_context()` in `honcho_tools.py` checks for `honcho_manager` and `honcho_session_key` in the kwargs first, falling back to the module-global `_session_manager` / `_session_key` for CLI mode where there's only one session.
|
||||
|
||||
### Memory flush lifecycle
|
||||
|
||||
When a session is reset, resumed, or expires, the gateway flushes memories before discarding context. The flush creates a temporary `AIAgent` with:
|
||||
|
||||
- `session_id` set to the old session's ID (so transcripts load correctly)
|
||||
- `honcho_session_key` set to the gateway session key (so Honcho writes go to the right place)
|
||||
- `sync_honcho=False` passed to `run_conversation()` (so the synthetic flush turn doesn't write back to Honcho's conversation history)
|
||||
|
||||
After the flush completes, any queued Honcho writes are drained and the gateway-level Honcho manager is shut down for that session key.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Session Storage](./session-storage.md)
|
||||
- [Cron Internals](./cron-internals.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
89
hermes_code/website/docs/developer-guide/prompt-assembly.md
Normal file
|
|
@ -0,0 +1,89 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Prompt Assembly"
|
||||
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
|
||||
---
|
||||
|
||||
# Prompt Assembly
|
||||
|
||||
Hermes deliberately separates:
|
||||
|
||||
- **cached system prompt state**
|
||||
- **ephemeral API-call-time additions**
|
||||
|
||||
This is one of the most important design choices in the project because it affects:
|
||||
|
||||
- token usage
|
||||
- prompt caching effectiveness
|
||||
- session continuity
|
||||
- memory correctness
|
||||
|
||||
Primary files:
|
||||
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `tools/memory_tool.py`
|
||||
|
||||
## Cached system prompt layers
|
||||
|
||||
The cached system prompt is assembled in roughly this order:
|
||||
|
||||
1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
|
||||
2. tool-aware behavior guidance
|
||||
3. Honcho static block (when active)
|
||||
4. optional system message
|
||||
5. frozen MEMORY snapshot
|
||||
6. frozen USER profile snapshot
|
||||
7. skills index
|
||||
8. context files (for example `AGENTS.md` or `.cursorrules` / `.cursor/rules/*.mdc`; the full priority order is listed under Context files). SOUL.md is **not** included here when it was already loaded as the identity in step 1
|
||||
9. timestamp / optional session ID
|
||||
10. platform hint
|
||||
|
||||
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
|
||||
|
||||
## API-call-time-only layers
|
||||
|
||||
These are intentionally *not* persisted as part of the cached system prompt:
|
||||
|
||||
- `ephemeral_system_prompt`
|
||||
- prefill messages
|
||||
- gateway-derived session context overlays
|
||||
- later-turn Honcho recall injected into the current-turn user message
|
||||
|
||||
This separation keeps the stable prefix stable for caching.
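A minimal sketch of the split, assuming a hypothetical `PromptState` holder and `build_api_messages` helper (neither name comes from the Hermes source). The point it illustrates is that ephemeral material is layered onto copies at call time, so the cached prompt and the stored history stay untouched.

```python
from dataclasses import dataclass


@dataclass
class PromptState:
    cached_system_prompt: str          # built once per session
    ephemeral_system_prompt: str = ""  # applied at API-call time only


def build_api_messages(state: PromptState, history: list, recall: str = "") -> list:
    """Assemble one API call without mutating the cached prompt or the stored history."""
    system = state.cached_system_prompt
    if state.ephemeral_system_prompt:
        # Appending keeps the cached text as an unchanged prefix.
        system = f"{system}\n\n{state.ephemeral_system_prompt}"
    messages = [{"role": "system", "content": system}] + [dict(m) for m in history]
    if recall and messages[-1]["role"] == "user":
        # Recall is injected into a copy of the current-turn user message.
        messages[-1]["content"] = f"{recall}\n\n{messages[-1]['content']}"
    return messages
```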
|
||||
|
||||
## Memory snapshots
|
||||
|
||||
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
|
||||
|
||||
## Context files
|
||||
|
||||
`agent/prompt_builder.py` scans and sanitizes project context files using a **priority system** — only one type is loaded (first match wins):
|
||||
|
||||
1. `.hermes.md` / `HERMES.md` (walks to git root)
|
||||
2. `AGENTS.md` (recursive directory walk)
|
||||
3. `CLAUDE.md` (CWD only)
|
||||
4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only)
|
||||
|
||||
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
|
||||
|
||||
Long files are truncated before injection.
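The first-match-wins rule and the truncation step can be sketched as below. This is a simplified illustration: it only looks in one directory (no git-root or recursive walk), skips `.cursor/rules/*.mdc`, and the `load_context_file` name and 20,000-character cap are assumptions rather than values from `prompt_builder.py`.

```python
from pathlib import Path
from typing import Optional

# Priority order from the list above; directory walking is omitted for brevity.
CANDIDATE_GROUPS = [
    [".hermes.md", "HERMES.md"],
    ["AGENTS.md"],
    ["CLAUDE.md"],
    [".cursorrules"],
]


def load_context_file(cwd: Path, max_chars: int = 20_000) -> Optional[str]:
    """Return the first matching context file, truncated to max_chars."""
    for group in CANDIDATE_GROUPS:
        for name in group:
            path = cwd / name
            if path.is_file():
                text = path.read_text(encoding="utf-8")
                return text[:max_chars]  # the cap is an assumed value
    return None
```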
|
||||
|
||||
## Skills index
|
||||
|
||||
The skills system contributes a compact skills index to the prompt when skills tooling is available.
|
||||
|
||||
## Why prompt assembly is split this way
|
||||
|
||||
The architecture is intentionally optimized to:
|
||||
|
||||
- preserve provider-side prompt caching
|
||||
- avoid mutating history unnecessarily
|
||||
- keep memory semantics understandable
|
||||
- let gateway/ACP/CLI add context without poisoning persistent prompt state
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
- [Session Storage](./session-storage.md)
|
||||
- [Gateway Internals](./gateway-internals.md)
|
||||
186
hermes_code/website/docs/developer-guide/provider-runtime.md
Normal file
|
|
@ -0,0 +1,186 @@
|
|||
---
|
||||
sidebar_position: 4
|
||||
title: "Provider Runtime Resolution"
|
||||
description: "How Hermes resolves providers, credentials, API modes, and auxiliary models at runtime"
|
||||
---
|
||||
|
||||
# Provider Runtime Resolution
|
||||
|
||||
Hermes has a shared provider runtime resolver used across:
|
||||
|
||||
- CLI
|
||||
- gateway
|
||||
- cron jobs
|
||||
- ACP
|
||||
- auxiliary model calls
|
||||
|
||||
Primary implementation:
|
||||
|
||||
- `hermes_cli/runtime_provider.py` — credential resolution, `_resolve_custom_runtime()`
|
||||
- `hermes_cli/auth.py` — provider registry, `resolve_provider()`
|
||||
- `hermes_cli/model_switch.py` — shared `/model` switch pipeline (CLI + gateway)
|
||||
- `agent/auxiliary_client.py` — auxiliary model routing
|
||||
|
||||
If you are trying to add a new first-class inference provider, read [Adding Providers](./adding-providers.md) alongside this page.
|
||||
|
||||
## Resolution precedence
|
||||
|
||||
At a high level, provider resolution uses:
|
||||
|
||||
1. explicit CLI/runtime request
|
||||
2. `config.yaml` model/provider config
|
||||
3. environment variables
|
||||
4. provider-specific defaults or auto resolution
|
||||
|
||||
That ordering matters because Hermes treats the saved model/provider choice as the source of truth for normal runs. This prevents a stale shell export from silently overriding the endpoint a user last selected in `hermes model`.
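The precedence above can be sketched as a short chain of checks. Everything here is illustrative: `pick_provider`, the `model.provider` config shape, and the `HERMES_PROVIDER` variable name are assumptions for the example and are not taken from `runtime_provider.py`.

```python
from typing import Optional


def pick_provider(explicit: Optional[str], config: dict, env: dict, default: str = "auto") -> tuple:
    """Return (provider, source) using the precedence listed above."""
    if explicit:
        return explicit, "explicit"
    saved = (config.get("model") or {}).get("provider")
    if saved:
        # The saved choice beats any variable exported in the shell.
        return saved, "config"
    from_env = env.get("HERMES_PROVIDER")  # hypothetical variable name
    if from_env:
        return from_env, "env"
    return default, "default"
```

Returning the source alongside the provider makes it easy to explain to a user why a given endpoint was chosen.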
|
||||
|
||||
## Providers
|
||||
|
||||
Current provider families include:
|
||||
|
||||
- AI Gateway (Vercel)
|
||||
- OpenRouter
|
||||
- Nous Portal
|
||||
- OpenAI Codex
|
||||
- Anthropic (native)
|
||||
- Z.AI
|
||||
- Kimi / Moonshot
|
||||
- MiniMax
|
||||
- MiniMax China
|
||||
- Custom (`provider: custom`) — first-class provider for any OpenAI-compatible endpoint
|
||||
- Named custom providers (`custom_providers` list in config.yaml)
|
||||
|
||||
## Output of runtime resolution
|
||||
|
||||
The runtime resolver returns data such as:
|
||||
|
||||
- `provider`
|
||||
- `api_mode`
|
||||
- `base_url`
|
||||
- `api_key`
|
||||
- `source`
|
||||
- provider-specific metadata like expiry/refresh info
|
||||
|
||||
## Why this matters
|
||||
|
||||
This resolver is the main reason Hermes can share auth/runtime logic between:
|
||||
|
||||
- `hermes chat`
|
||||
- gateway message handling
|
||||
- cron jobs running in fresh sessions
|
||||
- ACP editor sessions
|
||||
- auxiliary model tasks
|
||||
|
||||
## AI Gateway
|
||||
|
||||
Set `AI_GATEWAY_API_KEY` in `~/.hermes/.env` and run with `--provider ai-gateway`. Hermes fetches available models from the gateway's `/models` endpoint, filtering to language models with tool-use support.
|
||||
|
||||
## OpenRouter, AI Gateway, and custom OpenAI-compatible base URLs
|
||||
|
||||
Hermes contains logic to avoid leaking the wrong API key to a custom endpoint when multiple provider keys exist (e.g. `OPENROUTER_API_KEY`, `AI_GATEWAY_API_KEY`, and `OPENAI_API_KEY`).
|
||||
|
||||
Each provider's API key is scoped to its own base URL:
|
||||
|
||||
- `OPENROUTER_API_KEY` is only sent to `openrouter.ai` endpoints
|
||||
- `AI_GATEWAY_API_KEY` is only sent to `ai-gateway.vercel.sh` endpoints
|
||||
- `OPENAI_API_KEY` is used for custom endpoints and as a fallback
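The scoping rule above amounts to choosing a key by the host of the target URL. The sketch below assumes a hypothetical `select_api_key` helper and an exact-host match; the real logic may differ in how it compares hosts.

```python
from urllib.parse import urlparse


def select_api_key(base_url: str, env: dict) -> str:
    """Pick the key whose provider owns the host of base_url."""
    host = urlparse(base_url).hostname or ""
    if host == "openrouter.ai" or host.endswith(".openrouter.ai"):
        return env.get("OPENROUTER_API_KEY", "")
    if host == "ai-gateway.vercel.sh":
        return env.get("AI_GATEWAY_API_KEY", "")
    # Custom endpoints (and the fallback case) use OPENAI_API_KEY.
    return env.get("OPENAI_API_KEY", "")
```

Matching on the parsed hostname rather than a substring of the URL keeps a look-alike host from receiving a provider key.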
|
||||
|
||||
Hermes also distinguishes between:
|
||||
|
||||
- a real custom endpoint selected by the user
|
||||
- the OpenRouter fallback path used when no custom endpoint is configured
|
||||
|
||||
That distinction is especially important for:
|
||||
|
||||
- local model servers
|
||||
- non-OpenRouter/non-AI Gateway OpenAI-compatible APIs
|
||||
- switching providers without re-running setup
|
||||
- config-saved custom endpoints that should keep working even when `OPENAI_BASE_URL` is not exported in the current shell
|
||||
|
||||
## Native Anthropic path
|
||||
|
||||
Anthropic models are no longer reached only through OpenRouter; Hermes also supports a native path.
|
||||
|
||||
When provider resolution selects `anthropic`, Hermes uses:
|
||||
|
||||
- `api_mode = anthropic_messages`
|
||||
- the native Anthropic Messages API
|
||||
- `agent/anthropic_adapter.py` for translation
|
||||
|
||||
Credential resolution for native Anthropic now prefers refreshable Claude Code credentials over copied env tokens when both are present. In practice that means:
|
||||
|
||||
- Claude Code credential files are treated as the preferred source when they include refreshable auth
|
||||
- manual `ANTHROPIC_TOKEN` / `CLAUDE_CODE_OAUTH_TOKEN` values still work as explicit overrides
|
||||
- Hermes preflights Anthropic credential refresh before native Messages API calls
|
||||
- Hermes still retries once on a 401 after rebuilding the Anthropic client, as a fallback path
|
||||
|
||||
## OpenAI Codex path
|
||||
|
||||
Codex uses a separate Responses API path:
|
||||
|
||||
- `api_mode = codex_responses`
|
||||
- dedicated credential resolution and auth store support
|
||||
|
||||
## Auxiliary model routing
|
||||
|
||||
Auxiliary tasks such as:
|
||||
|
||||
- vision
|
||||
- web extraction summarization
|
||||
- context compression summaries
|
||||
- session search summarization
|
||||
- skills hub operations
|
||||
- MCP helper operations
|
||||
- memory flushes
|
||||
|
||||
can use their own provider/model routing rather than the main conversational model.
|
||||
|
||||
When an auxiliary task is configured with provider `main`, Hermes resolves that through the same shared runtime path as normal chat. In practice that means:
|
||||
|
||||
- env-driven custom endpoints still work
|
||||
- custom endpoints saved via `hermes model` / `config.yaml` also work
|
||||
- auxiliary routing can tell the difference between a real saved custom endpoint and the OpenRouter fallback
|
||||
|
||||
## Fallback models
|
||||
|
||||
Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
|
||||
|
||||
### How it works internally
|
||||
|
||||
1. **Storage**: `AIAgent.__init__` stores the `fallback_model` dict and sets `_fallback_activated = False`.

2. **Trigger points**: `_try_activate_fallback()` is called from three places in the main retry loop in `run_agent.py`:
   - After max retries on invalid API responses (None choices, missing content)
   - On non-retryable client errors (HTTP 401, 403, 404)
   - After max retries on transient errors (HTTP 429, 500, 502, 503)

3. **Activation flow** (`_try_activate_fallback`):
   - Returns `False` immediately if already activated or not configured
   - Calls `resolve_provider_client()` from `auxiliary_client.py` to build a new client with proper auth
   - Determines `api_mode`: `codex_responses` for openai-codex, `anthropic_messages` for anthropic, `chat_completions` for everything else
   - Swaps in-place: `self.model`, `self.provider`, `self.base_url`, `self.api_mode`, `self.client`, `self._client_kwargs`
   - For anthropic fallback: builds a native Anthropic client instead of OpenAI-compatible
   - Re-evaluates prompt caching (enabled for Claude models on OpenRouter)
   - Sets `_fallback_activated = True`, which prevents it from firing again
   - Resets retry count to 0 and continues the loop

4. **Config flow**:
   - CLI: `cli.py` reads `CLI_CONFIG["fallback_model"]` → passes to `AIAgent(fallback_model=...)`
   - Gateway: `gateway/run.py._load_fallback_model()` reads `config.yaml` → passes to `AIAgent`
   - Validation: both `provider` and `model` keys must be non-empty, or fallback is disabled
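The one-shot semantics and the `api_mode` mapping described above can be modeled in a few lines. `FallbackAgent` is a hypothetical, stripped-down stand-in for `AIAgent`; it leaves out client construction, prompt caching, and the retry loop.

```python
class FallbackAgent:
    """Hypothetical miniature of the one-shot fallback switch."""

    def __init__(self, model, provider, fallback_model=None):
        self.model = model
        self.provider = provider
        fb = fallback_model or {}
        # Both keys must be non-empty, otherwise fallback is disabled.
        self._fallback = fb if fb.get("provider") and fb.get("model") else None
        self._fallback_activated = False

    def _try_activate_fallback(self) -> bool:
        if self._fallback_activated or self._fallback is None:
            return False
        self.model = self._fallback["model"]
        self.provider = self._fallback["provider"]
        self._fallback_activated = True  # prevents firing again
        return True


def api_mode_for(provider: str) -> str:
    """Map a provider name to the API mode named in the activation flow."""
    modes = {"openai-codex": "codex_responses", "anthropic": "anthropic_messages"}
    return modes.get(provider, "chat_completions")
```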
|
||||
|
||||
### What does NOT support fallback
|
||||
|
||||
- **Subagent delegation** (`tools/delegate_tool.py`): subagents inherit the parent's provider but not the fallback config
|
||||
- **Cron jobs** (`cron/`): run with a fixed provider, no fallback mechanism
|
||||
- **Auxiliary tasks**: use their own independent provider auto-detection chain (see Auxiliary model routing above)
|
||||
|
||||
### Test coverage
|
||||
|
||||
See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Agent Loop Internals](./agent-loop.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
66
hermes_code/website/docs/developer-guide/session-storage.md
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Session Storage"
|
||||
description: "How Hermes stores sessions in SQLite, maintains lineage, and exposes recall/search"
|
||||
---
|
||||
|
||||
# Session Storage
|
||||
|
||||
Hermes uses a SQLite-backed session store as the main source of truth for historical conversation state.
|
||||
|
||||
Primary files:
|
||||
|
||||
- `hermes_state.py`
|
||||
- `gateway/session.py`
|
||||
- `tools/session_search_tool.py`
|
||||
|
||||
## Main database
|
||||
|
||||
The primary store lives at:
|
||||
|
||||
```text
|
||||
~/.hermes/state.db
|
||||
```
|
||||
|
||||
It contains:
|
||||
|
||||
- sessions
|
||||
- messages
|
||||
- metadata such as token counts and titles
|
||||
- lineage relationships
|
||||
- full-text search indexes
|
||||
|
||||
## What is stored per session
|
||||
|
||||
Examples of important session metadata:
|
||||
|
||||
- session ID
|
||||
- source/platform
|
||||
- title
|
||||
- created/updated timestamps
|
||||
- token counts
|
||||
- tool call counts
|
||||
- stored system prompt snapshot
|
||||
- parent session ID after compression splits
|
||||
|
||||
## Lineage
|
||||
|
||||
When Hermes compresses a conversation, it can continue in a new session ID while preserving ancestry via `parent_session_id`.
|
||||
|
||||
This means resuming/searching can follow session families instead of treating each compressed shard as unrelated.
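To show how a session family might be followed, here is a small sketch using an in-memory SQLite database. Only the `parent_session_id` column name comes from the text; the two-column `sessions` table and the `session_ancestry` helper are assumptions and not the real `state.db` schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, parent_session_id TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("s1", None), ("s2", "s1"), ("s3", "s2"), ("other", None)],
)


def session_ancestry(conn, session_id):
    """Walk parent_session_id links from a session back to its root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_session_id, depth) AS (
            SELECT id, parent_session_id, 0 FROM sessions WHERE id = ?
            UNION ALL
            SELECT s.id, s.parent_session_id, chain.depth + 1
            FROM sessions s JOIN chain ON s.id = chain.parent_session_id
        )
        SELECT id FROM chain ORDER BY depth
        """,
        (session_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

A recursive query like this lets resume or search code treat the compressed shards of one conversation as a single family.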
|
||||
|
||||
## Gateway vs CLI persistence
|
||||
|
||||
- CLI uses the state DB directly for resume/history/search
|
||||
- gateway keeps active-session mappings and may also maintain additional platform transcript/state files
|
||||
- some legacy JSON/JSONL artifacts still exist for compatibility, but SQLite is the main historical store
|
||||
|
||||
## Session search
|
||||
|
||||
The `session_search` tool uses the session DB's search features to retrieve and summarize relevant past work.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Gateway Internals](./gateway-internals.md)
|
||||
- [Prompt Assembly](./prompt-assembly.md)
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
65
hermes_code/website/docs/developer-guide/tools-runtime.md
Normal file
|
|
@ -0,0 +1,65 @@
|
|||
---
|
||||
sidebar_position: 9
|
||||
title: "Tools Runtime"
|
||||
description: "Runtime behavior of the tool registry, toolsets, dispatch, and terminal environments"
|
||||
---
|
||||
|
||||
# Tools Runtime
|
||||
|
||||
Hermes tools are self-registering functions grouped into toolsets and executed through a central registry/dispatch system.
|
||||
|
||||
Primary files:
|
||||
|
||||
- `tools/registry.py`
|
||||
- `model_tools.py`
|
||||
- `toolsets.py`
|
||||
- `tools/terminal_tool.py`
|
||||
- `tools/environments/*`
|
||||
|
||||
## Tool registration model
|
||||
|
||||
Each tool module calls `registry.register(...)` at import time.
|
||||
|
||||
`model_tools.py` is responsible for importing/discovering tool modules and building the schema list used by the model.
|
||||
|
||||
## Toolset resolution
|
||||
|
||||
Toolsets are named bundles of tools. Hermes resolves them through:
|
||||
|
||||
- explicit enabled/disabled toolset lists
|
||||
- platform presets (`hermes-cli`, `hermes-telegram`, etc.)
|
||||
- dynamic MCP toolsets
|
||||
- curated special-purpose sets like `hermes-acp`
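A simplified picture of how the first two inputs could combine is shown below. The toolset contents, the preset contents, and the `resolve_tools` helper are all illustrative assumptions; the real definitions live in `toolsets.py`, and MCP toolsets are left out.

```python
# Hypothetical toolset and preset definitions for illustration only.
TOOLSETS = {
    "web": {"web_search", "web_extract"},
    "terminal": {"terminal"},
    "memory": {"memory"},
}
PRESETS = {"hermes-cli": ["web", "terminal", "memory"]}


def resolve_tools(preset=None, enabled=None, disabled=None):
    """Expand toolset names into a sorted list of tool names."""
    # An explicit enabled list replaces the platform preset.
    names = list(enabled) if enabled else list(PRESETS.get(preset, []))
    blocked = set(disabled or [])
    tools = set()
    for name in names:
        if name not in blocked:
            tools |= TOOLSETS.get(name, set())
    return sorted(tools)
```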
|
||||
|
||||
## Dispatch
|
||||
|
||||
At runtime, tools are dispatched through the central registry, with agent-loop exceptions for some agent-level tools such as memory/todo/session-search handling.
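Registration and dispatch can be sketched together with a miniature registry. The `ToolRegistry` class, its method signatures, and the `echo` tool are hypothetical; they show the general shape of a self-registering design, not the actual interface of `tools/registry.py`.

```python
import json


class ToolRegistry:
    """Hypothetical miniature of a self-registering tool registry."""

    def __init__(self):
        self._tools = {}

    def register(self, name, toolset, schema, handler):
        self._tools[name] = {"toolset": toolset, "schema": schema, "handler": handler}

    def schemas(self, enabled_toolsets):
        # Only tools from enabled toolsets are advertised to the model.
        return [t["schema"] for t in self._tools.values() if t["toolset"] in enabled_toolsets]

    def dispatch(self, name, args, **kwargs):
        tool = self._tools.get(name)
        if tool is None:
            return json.dumps({"error": f"unknown tool: {name}"})
        return tool["handler"](args, **kwargs)


registry = ToolRegistry()

# A tool module would run a call like this at import time.
registry.register(
    name="echo",
    toolset="demo",
    schema={"name": "echo", "parameters": {"type": "object"}},
    handler=lambda args, **kw: json.dumps({"echo": args["text"]}),
)
```

Passing `**kwargs` through `dispatch` is what allows per-call context, such as a session key, to reach a handler without global state.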
|
||||
|
||||
## Terminal/runtime environments
|
||||
|
||||
The terminal system supports multiple backends:
|
||||
|
||||
- local
|
||||
- docker
|
||||
- ssh
|
||||
- singularity
|
||||
- modal
|
||||
- daytona
|
||||
|
||||
It also supports:
|
||||
|
||||
- per-task cwd overrides
|
||||
- background process management
|
||||
- PTY mode
|
||||
- approval callbacks for dangerous commands
|
||||
|
||||
## Concurrency
|
||||
|
||||
Tool calls may execute sequentially or concurrently depending on the tool mix and interaction requirements.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Toolsets Reference](../reference/toolsets-reference.md)
|
||||
- [Built-in Tools Reference](../reference/tools-reference.md)
|
||||
- [Agent Loop Internals](./agent-loop.md)
|
||||
- [ACP Internals](./acp-internals.md)
|
||||
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
sidebar_position: 10
|
||||
title: "Trajectories & Training Format"
|
||||
description: "How Hermes saves trajectories, normalizes tool calls, and produces training-friendly outputs"
|
||||
---
|
||||
|
||||
# Trajectories & Training Format
|
||||
|
||||
Hermes can save conversation trajectories for training, evaluation, and batch data generation workflows.
|
||||
|
||||
Primary files:
|
||||
|
||||
- `agent/trajectory.py`
|
||||
- `run_agent.py`
|
||||
- `batch_runner.py`
|
||||
- `trajectory_compressor.py`
|
||||
|
||||
## What trajectories are for
|
||||
|
||||
Trajectory outputs are used for:
|
||||
|
||||
- SFT data generation
|
||||
- debugging agent behavior
|
||||
- benchmark/evaluation artifact capture
|
||||
- post-processing and compression pipelines
|
||||
|
||||
## Normalization strategy
|
||||
|
||||
Hermes converts live conversation structure into a training-friendly format.
|
||||
|
||||
Important behaviors include:
|
||||
|
||||
- representing reasoning in explicit markup
|
||||
- converting tool calls into structured XML-like regions for dataset compatibility
|
||||
- grouping tool outputs appropriately
|
||||
- separating successful and failed trajectories
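The first two behaviors can be illustrated with a small converter. The `<think>` and `<tool_call>` tag names, the message dict shape, and the `normalize_assistant_turn` helper are assumptions chosen for the sketch; the exact markup produced by `agent/trajectory.py` may differ.

```python
import json


def normalize_assistant_turn(message: dict) -> str:
    """Render one assistant message as text with explicit markup regions."""
    parts = []
    if message.get("reasoning"):
        parts.append(f"<think>\n{message['reasoning']}\n</think>")
    if message.get("content"):
        parts.append(message["content"])
    for call in message.get("tool_calls", []):
        # Each tool call becomes a JSON payload inside an XML-like region.
        payload = {"name": call["name"], "arguments": call["arguments"]}
        parts.append(f"<tool_call>\n{json.dumps(payload)}\n</tool_call>")
    return "\n".join(parts)
```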
|
||||
|
||||
## Persistence boundaries
|
||||
|
||||
Trajectory files do **not** blindly mirror all runtime prompt state.
|
||||
|
||||
Some prompt-time-only layers are intentionally excluded from persisted trajectory content so datasets are cleaner and less environment-specific.
|
||||
|
||||
## Batch runner
|
||||
|
||||
`batch_runner.py` emits richer metadata than single-session trajectory saving, including:
|
||||
|
||||
- model/provider metadata
|
||||
- toolset info
|
||||
- partial/failure markers
|
||||
- tool statistics
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Environments, Benchmarks & Data Generation](./environments.md)
|
||||
- [Agent Loop Internals](./agent-loop.md)
|
||||
8
hermes_code/website/docs/getting-started/_category_.json
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "Getting Started",
|
||||
"position": 1,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Get up and running with Hermes Agent in minutes."
|
||||
}
|
||||
}
|
||||
266
hermes_code/website/docs/getting-started/installation.md
Normal file
|
|
@ -0,0 +1,266 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Installation"
|
||||
description: "Install Hermes Agent on Linux, macOS, or WSL2"
|
||||
---
|
||||
|
||||
# Installation
|
||||
|
||||
Get Hermes Agent up and running in under two minutes with the one-line installer, or follow the manual steps for full control.
|
||||
|
||||
## Quick Install
|
||||
|
||||
### Linux / macOS / WSL2
|
||||
|
||||
```bash
|
||||
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
```
|
||||
|
||||
:::warning Windows
|
||||
Native Windows is **not supported**. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run Hermes Agent from there. The install command above works inside WSL2.
|
||||
:::
|
||||
|
||||
### What the Installer Does
|
||||
|
||||
The installer handles everything automatically — all dependencies (Python, Node.js, ripgrep, ffmpeg), the repo clone, virtual environment, global `hermes` command setup, and LLM provider configuration. By the end, you're ready to chat.
|
||||
|
||||
### After Installation
|
||||
|
||||
Reload your shell and start chatting:
|
||||
|
||||
```bash
|
||||
source ~/.bashrc # or: source ~/.zshrc
|
||||
hermes # Start chatting!
|
||||
```
|
||||
|
||||
To reconfigure individual settings later, use the dedicated commands:
|
||||
|
||||
```bash
|
||||
hermes model # Choose your LLM provider and model
|
||||
hermes tools # Configure which tools are enabled
|
||||
hermes gateway setup # Set up messaging platforms
|
||||
hermes config set # Set individual config values
|
||||
hermes setup # Or run the full setup wizard to configure everything at once
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
The only prerequisite is **Git**. The installer automatically handles everything else:
|
||||
|
||||
- **uv** (fast Python package manager)
|
||||
- **Python 3.11** (via uv, no sudo needed)
|
||||
- **Node.js v22** (for browser automation and WhatsApp bridge)
|
||||
- **ripgrep** (fast file search)
|
||||
- **ffmpeg** (audio format conversion for TTS)
|
||||
|
||||
:::info
|
||||
You do **not** need to install Python, Node.js, ripgrep, or ffmpeg manually. The installer detects what's missing and installs it for you. Just make sure `git` is available (`git --version`).
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Manual Installation
|
||||
|
||||
If you prefer full control over the installation process, follow these steps.
|
||||
|
||||
### Step 1: Clone the Repository
|
||||
|
||||
Clone with `--recurse-submodules` to pull the required submodules:
|
||||
|
||||
```bash
|
||||
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
|
||||
cd hermes-agent
|
||||
```
|
||||
|
||||
If you already cloned without `--recurse-submodules`:
|
||||
```bash
|
||||
git submodule update --init --recursive
|
||||
```
|
||||
|
||||
### Step 2: Install uv & Create Virtual Environment
|
||||
|
||||
```bash
|
||||
# Install uv (if not already installed)
|
||||
curl -LsSf https://astral.sh/uv/install.sh | sh
|
||||
|
||||
# Create venv with Python 3.11 (uv downloads it if not present — no sudo needed)
|
||||
uv venv venv --python 3.11
|
||||
```
|
||||
|
||||
:::tip
|
||||
You do **not** need to activate the venv to use `hermes`. The entry point has a hardcoded shebang pointing to the venv Python, so it works globally once symlinked.
|
||||
:::
|
||||
|
||||
### Step 3: Install Python Dependencies
|
||||
|
||||
```bash
|
||||
# Tell uv which venv to install into
|
||||
export VIRTUAL_ENV="$(pwd)/venv"
|
||||
|
||||
# Install with all extras
|
||||
uv pip install -e ".[all]"
|
||||
```
|
||||
|
||||
If you only want the core agent (no Telegram/Discord/cron support):
|
||||
```bash
|
||||
uv pip install -e "."
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary><strong>Optional extras breakdown</strong></summary>
|
||||
|
||||
| Extra | What it adds | Install command |
|
||||
|-------|-------------|-----------------|
|
||||
| `all` | Everything below | `uv pip install -e ".[all]"` |
|
||||
| `messaging` | Telegram & Discord gateway | `uv pip install -e ".[messaging]"` |
|
||||
| `cron` | Cron expression parsing for scheduled tasks | `uv pip install -e ".[cron]"` |
|
||||
| `cli` | Terminal menu UI for setup wizard | `uv pip install -e ".[cli]"` |
|
||||
| `modal` | Modal cloud execution backend | `uv pip install -e ".[modal]"` |
|
||||
| `tts-premium` | ElevenLabs premium voices | `uv pip install -e ".[tts-premium]"` |
|
||||
| `voice` | CLI microphone input + audio playback | `uv pip install -e ".[voice]"` |
|
||||
| `pty` | PTY terminal support | `uv pip install -e ".[pty]"` |
|
||||
| `honcho` | AI-native memory (Honcho integration) | `uv pip install -e ".[honcho]"` |
|
||||
| `mcp` | Model Context Protocol support | `uv pip install -e ".[mcp]"` |
|
||||
| `homeassistant` | Home Assistant integration | `uv pip install -e ".[homeassistant]"` |
|
||||
| `acp` | ACP editor integration support | `uv pip install -e ".[acp]"` |
|
||||
| `slack` | Slack messaging | `uv pip install -e ".[slack]"` |
|
||||
| `dev` | pytest & test utilities | `uv pip install -e ".[dev]"` |
|
||||
|
||||
You can combine extras: `uv pip install -e ".[messaging,cron]"`
|
||||
|
||||
</details>
|
||||
|
||||
### Step 4: Install Optional Submodules (if needed)
|
||||
|
||||
```bash
|
||||
# RL training backend (optional)
|
||||
uv pip install -e "./tinker-atropos"
|
||||
```
|
||||
|
||||
This submodule is optional; if you skip it, the corresponding toolset simply won't be available.
|
||||
|
||||
### Step 5: Install Node.js Dependencies (Optional)
|
||||
|
||||
Only needed for **browser automation** (Browserbase-powered) and the **WhatsApp bridge**:
|
||||
|
||||
```bash
|
||||
npm install
|
||||
```
|
||||
|
||||
### Step 6: Create the Configuration Directory
|
||||
|
||||
```bash
|
||||
# Create the directory structure
|
||||
mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills,pairing,hooks,image_cache,audio_cache,whatsapp/session}
|
||||
|
||||
# Copy the example config file
|
||||
cp cli-config.yaml.example ~/.hermes/config.yaml
|
||||
|
||||
# Create an empty .env file for API keys
|
||||
touch ~/.hermes/.env
|
||||
```
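
The `mkdir -p` line relies on shell brace expansion, which expands to ten separate paths. On a shell without brace expansion, the same layout can be sketched in Python. This is an illustrative equivalent under that assumption, not a script shipped with Hermes; `create_hermes_dirs` and `HERMES_SUBDIRS` are hypothetical names.

```python
from pathlib import Path

# Subdirectories named in the mkdir command above.
HERMES_SUBDIRS = [
    "cron", "sessions", "logs", "memories", "skills", "pairing",
    "hooks", "image_cache", "audio_cache", "whatsapp/session",
]


def create_hermes_dirs(root: Path) -> list[Path]:
    """Create each subdirectory under root, the way `mkdir -p` would."""
    created = []
    for name in HERMES_SUBDIRS:
        target = root / name
        # parents=True creates "whatsapp" before "whatsapp/session";
        # exist_ok=True makes the call safe to repeat.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


# Example:
# create_hermes_dirs(Path.home() / ".hermes")
```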
|
||||
|
||||
### Step 7: Add Your API Keys
|
||||
|
||||
Open `~/.hermes/.env` and add at minimum an LLM provider key:
|
||||
|
||||
```bash
|
||||
# Required — at least one LLM provider:
|
||||
OPENROUTER_API_KEY=sk-or-v1-your-key-here
|
||||
|
||||
# Optional — enable additional tools:
|
||||
FIRECRAWL_API_KEY=fc-your-key # Web search & scraping (or self-host, see docs)
|
||||
FAL_KEY=your-fal-key # Image generation (FLUX)
|
||||
```
|
||||
|
||||
Or set them via the CLI:
|
||||
```bash
|
||||
hermes config set OPENROUTER_API_KEY sk-or-v1-your-key-here
|
||||
```
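
To sanity-check the `.env` file by hand, a small parser can confirm that a provider key is present. The sketch below assumes plain `KEY=value` lines with `#` comments, as in the example above; it is not how Hermes itself loads the file, and `parse_env` and `has_provider_key` are hypothetical helpers.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "fc-key   # Web search".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def has_provider_key(env: dict[str, str], names=("OPENROUTER_API_KEY",)) -> bool:
    """True if any of the given variable names has a non-empty value."""
    return any(env.get(name) for name in names)


# Example:
# from pathlib import Path
# env = parse_env((Path.home() / ".hermes" / ".env").read_text())
# print(has_provider_key(env))
```

Pass a different `names` tuple if you use a provider other than OpenRouter.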
|
||||
|
||||
### Step 8: Add `hermes` to Your PATH
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.local/bin
|
||||
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
|
||||
```
|
||||
|
||||
If `~/.local/bin` isn't on your PATH, add it to your shell config:
|
||||
|
||||
```bash
|
||||
# Bash
|
||||
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
|
||||
|
||||
# Zsh
|
||||
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc && source ~/.zshrc
|
||||
|
||||
# Fish
|
||||
fish_add_path $HOME/.local/bin
|
||||
```
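
If you are unsure whether `~/.local/bin` is already on your PATH, a quick check looks like the sketch below. `dir_on_path` is a hypothetical helper written for this guide; it only compares normalized path strings and does not resolve symlinks.

```python
import os
from pathlib import Path


def dir_on_path(directory: Path, path_value: str) -> bool:
    """Return True if directory appears as an entry in a PATH-style string."""
    wanted = os.path.normpath(os.path.expanduser(str(directory)))
    # Empty entries (from a leading or doubled separator) are ignored.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(
        os.path.normpath(os.path.expanduser(entry)) == wanted
        for entry in entries
    )


# Example:
# print(dir_on_path(Path("~/.local/bin"), os.environ.get("PATH", "")))
```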
|
||||
|
||||
### Step 9: Configure Your Provider
|
||||
|
||||
```bash
|
||||
hermes model # Select your LLM provider and model
|
||||
```
|
||||
|
||||
### Step 10: Verify the Installation
|
||||
|
||||
```bash
|
||||
hermes version # Check that the command is available
|
||||
hermes doctor # Run diagnostics to verify everything is working
|
||||
hermes status # Check your configuration
|
||||
hermes chat -q "Hello! What tools do you have available?"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick-Reference: Manual Install (Condensed)
|
||||
|
||||
For those who just want the commands:
|
||||
|
||||
```bash
|
||||
# Install uv
|
||||
curl -LsSf https://astral.sh/uv/install.sh | sh
|
||||
|
||||
# Clone & enter
|
||||
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
|
||||
cd hermes-agent
|
||||
|
||||
# Create venv with Python 3.11
|
||||
uv venv venv --python 3.11
|
||||
export VIRTUAL_ENV="$(pwd)/venv"
|
||||
|
||||
# Install everything
|
||||
uv pip install -e ".[all]"
|
||||
uv pip install -e "./tinker-atropos"
|
||||
npm install # optional, for browser tools and WhatsApp
|
||||
|
||||
# Configure
|
||||
mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills,pairing,hooks,image_cache,audio_cache,whatsapp/session}
|
||||
cp cli-config.yaml.example ~/.hermes/config.yaml
|
||||
touch ~/.hermes/.env
|
||||
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
|
||||
|
||||
# Make hermes available globally
|
||||
mkdir -p ~/.local/bin
|
||||
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
|
||||
|
||||
# Verify
|
||||
hermes doctor
|
||||
hermes
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| `hermes: command not found` | Reload your shell (`source ~/.bashrc`) or check PATH |
|
||||
| `API key not set` | Run `hermes model` to configure your provider, or `hermes config set OPENROUTER_API_KEY your_key` |
|
||||
| Missing config after update | Run `hermes config check` then `hermes config migrate` |
|
||||
|
||||
For more diagnostics, run `hermes doctor` — it will tell you exactly what's missing and how to fix it.
|
||||
152
hermes_code/website/docs/getting-started/learning-path.md
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: 'Learning Path'
|
||||
description: 'Choose your learning path through the Hermes Agent documentation based on your experience level and goals.'
|
||||
---
|
||||
|
||||
# Learning Path
|
||||
|
||||
Hermes Agent can do a lot — CLI assistant, Telegram/Discord bot, task automation, RL training, and more. This page helps you figure out where to start and what to read based on your experience level and what you're trying to accomplish.
|
||||
|
||||
:::tip Start Here
|
||||
If you haven't installed Hermes Agent yet, begin with the [Installation guide](/docs/getting-started/installation) and then run through the [Quickstart](/docs/getting-started/quickstart). Everything below assumes you have a working installation.
|
||||
:::
|
||||
|
||||
## How to Use This Page
|
||||
|
||||
- **Know your level?** Jump to the [experience-level table](#by-experience-level) and follow the reading order for your tier.
|
||||
- **Have a specific goal?** Skip to [By Use Case](#by-use-case) and find the scenario that matches.
|
||||
- **Just browsing?** Check the [Key Features](#key-features-at-a-glance) table for a quick overview of everything Hermes Agent can do.
|
||||
|
||||
## By Experience Level
|
||||
|
||||
| Level | Goal | Recommended Reading | Time Estimate |
|
||||
|---|---|---|---|
|
||||
| **Beginner** | Get up and running, have basic conversations, use built-in tools | [Installation](/docs/getting-started/installation) → [Quickstart](/docs/getting-started/quickstart) → [CLI Usage](/docs/user-guide/cli) → [Configuration](/docs/user-guide/configuration) | ~1 hour |
|
||||
| **Intermediate** | Set up messaging bots, use advanced features like memory, cron jobs, and skills | [Sessions](/docs/user-guide/sessions) → [Messaging](/docs/user-guide/messaging) → [Tools](/docs/user-guide/features/tools) → [Skills](/docs/user-guide/features/skills) → [Memory](/docs/user-guide/features/memory) → [Cron](/docs/user-guide/features/cron) | ~2–3 hours |
|
||||
| **Advanced** | Build custom tools, create skills, train models with RL, contribute to the project | [Architecture](/docs/developer-guide/architecture) → [Adding Tools](/docs/developer-guide/adding-tools) → [Creating Skills](/docs/developer-guide/creating-skills) → [RL Training](/docs/user-guide/features/rl-training) → [Contributing](/docs/developer-guide/contributing) | ~4–6 hours |
|
||||
|
||||
## By Use Case
|
||||
|
||||
Pick the scenario that matches what you want to do. Each one links you to the relevant docs in the order you should read them.
|
||||
|
||||
### "I want a CLI coding assistant"
|
||||
|
||||
Use Hermes Agent as an interactive terminal assistant for writing, reviewing, and running code.
|
||||
|
||||
1. [Installation](/docs/getting-started/installation)
|
||||
2. [Quickstart](/docs/getting-started/quickstart)
|
||||
3. [CLI Usage](/docs/user-guide/cli)
|
||||
4. [Code Execution](/docs/user-guide/features/code-execution)
|
||||
5. [Context Files](/docs/user-guide/features/context-files)
|
||||
6. [Tips & Tricks](/docs/guides/tips)
|
||||
|
||||
:::tip
|
||||
Pass files directly into your conversation with context files. Hermes Agent can read, edit, and run code in your projects.
|
||||
:::
|
||||
|
||||
### "I want a Telegram/Discord bot"
|
||||
|
||||
Deploy Hermes Agent as a bot on your favorite messaging platform.
|
||||
|
||||
1. [Installation](/docs/getting-started/installation)
|
||||
2. [Configuration](/docs/user-guide/configuration)
|
||||
3. [Messaging Overview](/docs/user-guide/messaging)
|
||||
4. [Telegram Setup](/docs/user-guide/messaging/telegram)
|
||||
5. [Discord Setup](/docs/user-guide/messaging/discord)
|
||||
6. [Voice Mode](/docs/user-guide/features/voice-mode)
|
||||
7. [Use Voice Mode with Hermes](/docs/guides/use-voice-mode-with-hermes)
|
||||
8. [Security](/docs/user-guide/security)
|
||||
|
||||
For full project examples, see:
|
||||
- [Daily Briefing Bot](/docs/guides/daily-briefing-bot)
|
||||
- [Team Telegram Assistant](/docs/guides/team-telegram-assistant)
|
||||
|
||||
### "I want to automate tasks"
|
||||
|
||||
Schedule recurring tasks, run batch jobs, or chain agent actions together.
|
||||
|
||||
1. [Quickstart](/docs/getting-started/quickstart)
|
||||
2. [Cron Scheduling](/docs/user-guide/features/cron)
|
||||
3. [Batch Processing](/docs/user-guide/features/batch-processing)
|
||||
4. [Delegation](/docs/user-guide/features/delegation)
|
||||
5. [Hooks](/docs/user-guide/features/hooks)
|
||||
|
||||
:::tip
|
||||
Cron jobs let Hermes Agent run tasks on a schedule — daily summaries, periodic checks, automated reports — without you being present.
|
||||
:::
|
||||
|
||||
### "I want to build custom tools/skills"
|
||||
|
||||
Extend Hermes Agent with your own tools and reusable skill packages.
|
||||
|
||||
1. [Tools Overview](/docs/user-guide/features/tools)
|
||||
2. [Skills Overview](/docs/user-guide/features/skills)
|
||||
3. [MCP (Model Context Protocol)](/docs/user-guide/features/mcp)
|
||||
4. [Architecture](/docs/developer-guide/architecture)
|
||||
5. [Adding Tools](/docs/developer-guide/adding-tools)
|
||||
6. [Creating Skills](/docs/developer-guide/creating-skills)
|
||||
|
||||
:::tip
|
||||
Tools are individual functions the agent can call. Skills are bundles of tools, prompts, and configuration packaged together. Start with tools, graduate to skills.
|
||||
:::
|
||||
|
||||
### "I want to train models"
|
||||
|
||||
Use reinforcement learning to fine-tune model behavior with Hermes Agent's built-in RL training pipeline.
|
||||
|
||||
1. [Quickstart](/docs/getting-started/quickstart)
|
||||
2. [Configuration](/docs/user-guide/configuration)
|
||||
3. [RL Training](/docs/user-guide/features/rl-training)
|
||||
4. [Provider Routing](/docs/user-guide/features/provider-routing)
|
||||
5. [Architecture](/docs/developer-guide/architecture)
|
||||
|
||||
:::tip
|
||||
RL training works best when you already understand the basics of how Hermes Agent handles conversations and tool calls. Run through the Beginner path first if you're new.
|
||||
:::
|
||||
|
||||
### "I want to use it as a Python library"
|
||||
|
||||
Integrate Hermes Agent into your own Python applications programmatically.
|
||||
|
||||
1. [Installation](/docs/getting-started/installation)
|
||||
2. [Quickstart](/docs/getting-started/quickstart)
|
||||
3. [Python Library Guide](/docs/guides/python-library)
|
||||
4. [Architecture](/docs/developer-guide/architecture)
|
||||
5. [Tools](/docs/user-guide/features/tools)
|
||||
6. [Sessions](/docs/user-guide/sessions)
|
||||
|
||||
## Key Features at a Glance
|
||||
|
||||
Not sure what's available? Here's a quick directory of major features:
|
||||
|
||||
| Feature | What It Does | Link |
|
||||
|---|---|---|
|
||||
| **Tools** | Built-in tools the agent can call (file I/O, search, shell, etc.) | [Tools](/docs/user-guide/features/tools) |
|
||||
| **Skills** | Installable plugin packages that add new capabilities | [Skills](/docs/user-guide/features/skills) |
|
||||
| **Memory** | Persistent memory across sessions | [Memory](/docs/user-guide/features/memory) |
|
||||
| **Context Files** | Feed files and directories into conversations | [Context Files](/docs/user-guide/features/context-files) |
|
||||
| **MCP** | Connect to external tool servers via Model Context Protocol | [MCP](/docs/user-guide/features/mcp) |
|
||||
| **Cron** | Schedule recurring agent tasks | [Cron](/docs/user-guide/features/cron) |
|
||||
| **Delegation** | Spawn sub-agents for parallel work | [Delegation](/docs/user-guide/features/delegation) |
|
||||
| **Code Execution** | Run code in sandboxed environments | [Code Execution](/docs/user-guide/features/code-execution) |
|
||||
| **Browser** | Web browsing and scraping | [Browser](/docs/user-guide/features/browser) |
|
||||
| **Hooks** | Event-driven callbacks and middleware | [Hooks](/docs/user-guide/features/hooks) |
|
||||
| **Batch Processing** | Process multiple inputs in bulk | [Batch Processing](/docs/user-guide/features/batch-processing) |
|
||||
| **RL Training** | Fine-tune models with reinforcement learning | [RL Training](/docs/user-guide/features/rl-training) |
|
||||
| **Provider Routing** | Route requests across multiple LLM providers | [Provider Routing](/docs/user-guide/features/provider-routing) |
|
||||
|
||||
## What to Read Next
|
||||
|
||||
Based on where you are right now:
|
||||
|
||||
- **Just finished installing?** → Head to the [Quickstart](/docs/getting-started/quickstart) to run your first conversation.
|
||||
- **Completed the Quickstart?** → Read [CLI Usage](/docs/user-guide/cli) and [Configuration](/docs/user-guide/configuration) to customize your setup.
|
||||
- **Comfortable with the basics?** → Explore [Tools](/docs/user-guide/features/tools), [Skills](/docs/user-guide/features/skills), and [Memory](/docs/user-guide/features/memory) to unlock the full power of the agent.
|
||||
- **Setting up for a team?** → Read [Security](/docs/user-guide/security) and [Sessions](/docs/user-guide/sessions) to understand access control and conversation management.
|
||||
- **Ready to build?** → Jump into the [Developer Guide](/docs/developer-guide/architecture) to understand the internals and start contributing.
|
||||
- **Want practical examples?** → Check out the [Guides](/docs/guides/tips) section for real-world projects and tips.
|
||||
|
||||
:::tip
|
||||
You don't need to read everything. Pick the path that matches your goal, follow the links in order, and you'll be productive quickly. You can always come back to this page to find your next step.
|
||||
:::
|
||||
227
hermes_code/website/docs/getting-started/quickstart.md
Normal file
|
|
@ -0,0 +1,227 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Quickstart"
|
||||
description: "Your first conversation with Hermes Agent — from install to chatting in 2 minutes"
|
||||
---
|
||||
|
||||
# Quickstart
|
||||
|
||||
This guide walks you through installing Hermes Agent, setting up a provider, and having your first conversation. By the end, you'll know the key features and how to explore further.
|
||||
|
||||
## 1. Install Hermes Agent
|
||||
|
||||
Run the one-line installer:
|
||||
|
||||
```bash
|
||||
# Linux / macOS / WSL2
|
||||
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
```
|
||||
|
||||
:::tip Windows Users
|
||||
Install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) first, then run the command above inside your WSL2 terminal.
|
||||
:::
|
||||
|
||||
After it finishes, reload your shell:
|
||||
|
||||
```bash
|
||||
source ~/.bashrc # or source ~/.zshrc
|
||||
```
|
||||
|
||||
## 2. Set Up a Provider
|
||||
|
||||
The installer configures your LLM provider automatically. To change it later, use one of these commands:
|
||||
|
||||
```bash
|
||||
hermes model # Choose your LLM provider and model
|
||||
hermes tools # Configure which tools are enabled
|
||||
hermes setup # Or configure everything at once
|
||||
```
|
||||
|
||||
`hermes model` walks you through selecting an inference provider:
|
||||
|
||||
| Provider | What it is | How to set up |
|
||||
|----------|-----------|---------------|
|
||||
| **Nous Portal** | Subscription-based, zero-config | OAuth login via `hermes model` |
|
||||
| **OpenAI Codex** | ChatGPT OAuth, uses Codex models | Device code auth via `hermes model` |
|
||||
| **Anthropic** | Claude models directly (Pro/Max or API key) | `hermes model` with Claude Code auth, or an Anthropic API key |
|
||||
| **OpenRouter** | Multi-provider routing across many models | Enter your API key |
|
||||
| **Z.AI** | GLM / Zhipu-hosted models | Set `GLM_API_KEY` / `ZAI_API_KEY` |
|
||||
| **Kimi / Moonshot** | Moonshot-hosted coding and chat models | Set `KIMI_API_KEY` |
|
||||
| **MiniMax** | International MiniMax endpoint | Set `MINIMAX_API_KEY` |
|
||||
| **MiniMax China** | China-region MiniMax endpoint | Set `MINIMAX_CN_API_KEY` |
|
||||
| **Alibaba Cloud** | Qwen models via DashScope | Set `DASHSCOPE_API_KEY` |
|
||||
| **Kilo Code** | KiloCode-hosted models | Set `KILOCODE_API_KEY` |
|
||||
| **OpenCode Zen** | Pay-as-you-go access to curated models | Set `OPENCODE_ZEN_API_KEY` |
|
||||
| **OpenCode Go** | $10/month subscription for open models | Set `OPENCODE_GO_API_KEY` |
|
||||
| **Vercel AI Gateway** | Vercel AI Gateway routing | Set `AI_GATEWAY_API_KEY` |
|
||||
| **Custom Endpoint** | VLLM, SGLang, Ollama, or any OpenAI-compatible API | Set base URL + API key |
|
||||
|
||||
:::tip
|
||||
You can switch providers at any time with `hermes model` — no code changes, no lock-in. When configuring a custom endpoint, Hermes will prompt for the context window size and auto-detect it when possible. See [Context Length Detection](../user-guide/configuration.md#context-length-detection) for details.
|
||||
:::
|
||||
|
||||
## 3. Start Chatting
|
||||
|
||||
```bash
|
||||
hermes
|
||||
```
|
||||
|
||||
That's it! You'll see a welcome banner with your model, available tools, and skills. Type a message and press Enter.
|
||||
|
||||
```
|
||||
❯ What can you help me with?
|
||||
```
|
||||
|
||||
The agent has access to tools for web search, file operations, terminal commands, and more — all out of the box.
|
||||
|
||||
## 4. Try Key Features
|
||||
|
||||
### Ask it to use the terminal
|
||||
|
||||
```
|
||||
❯ What's my disk usage? Show the top 5 largest directories.
|
||||
```
|
||||
|
||||
The agent will run terminal commands on your behalf and show you the results.
|
||||
|
||||
### Use slash commands
|
||||
|
||||
Type `/` to see an autocomplete dropdown of all commands:
|
||||
|
||||
| Command | What it does |
|
||||
|---------|-------------|
|
||||
| `/help` | Show all available commands |
|
||||
| `/tools` | List available tools |
|
||||
| `/model` | Switch models interactively |
|
||||
| `/personality pirate` | Try a fun personality |
|
||||
| `/save` | Save the conversation |
|
||||
|
||||
### Multi-line input
|
||||
|
||||
Press `Alt+Enter` or `Ctrl+J` to add a new line. Great for pasting code or writing detailed prompts.
|
||||
|
||||
### Interrupt the agent
|
||||
|
||||
If the agent is taking too long, just type a new message and press Enter — it interrupts the current task and switches to your new instructions. `Ctrl+C` also works.
|
||||
|
||||
### Resume a session
|
||||
|
||||
When you exit, Hermes prints a resume command:
|
||||
|
||||
```bash
|
||||
hermes --continue # Resume the most recent session
|
||||
hermes -c # Short form
|
||||
```
|
||||
|
||||
## 5. Explore Further
|
||||
|
||||
Here are some things to try next:
|
||||
|
||||
### Set up a sandboxed terminal
|
||||
|
||||
For safety, run the agent in a Docker container or on a remote server:
|
||||
|
||||
```bash
|
||||
hermes config set terminal.backend docker # Docker isolation
|
||||
hermes config set terminal.backend ssh # Remote server
|
||||
```
|
||||
|
||||
### Connect messaging platforms
|
||||
|
||||
Chat with Hermes from your phone or other surfaces via Telegram, Discord, Slack, WhatsApp, Signal, Email, or Home Assistant:
|
||||
|
||||
```bash
|
||||
hermes gateway setup # Interactive platform configuration
|
||||
```
|
||||
|
||||
### Add voice mode
|
||||
|
||||
Want microphone input in the CLI or spoken replies in messaging?
|
||||
|
||||
```bash
|
||||
pip install "hermes-agent[voice]"
|
||||
|
||||
# Optional but recommended for free local speech-to-text
|
||||
pip install faster-whisper
|
||||
```
|
||||
|
||||
Then start Hermes and enable it inside the CLI:
|
||||
|
||||
```text
|
||||
/voice on
|
||||
```
|
||||
|
||||
Press `Ctrl+B` to record, or use `/voice tts` to have Hermes speak its replies. See [Voice Mode](../user-guide/features/voice-mode.md) for the full setup across CLI, Telegram, Discord, and Discord voice channels.
|
||||
|
||||
### Schedule automated tasks
|
||||
|
||||
```
|
||||
❯ Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram.
|
||||
```
|
||||
|
||||
The agent will set up a cron job that runs automatically via the gateway.
|
||||
|
||||
### Browse and install skills
|
||||
|
||||
```bash
|
||||
hermes skills search kubernetes
|
||||
hermes skills search react --source skills-sh
|
||||
hermes skills search https://mintlify.com/docs --source well-known
|
||||
hermes skills install openai/skills/k8s
|
||||
hermes skills install official/security/1password
|
||||
hermes skills install skills-sh/vercel-labs/json-render/json-render-react --force
|
||||
```
|
||||
|
||||
Tips:
|
||||
- Use `--source skills-sh` to search the public `skills.sh` directory.
|
||||
- Use `--source well-known` with a docs/site URL to discover skills from `/.well-known/skills/index.json`.
|
||||
- Use `--force` only after reviewing a third-party skill. It can override non-dangerous policy blocks, but not a `dangerous` scan verdict.
|
||||
|
||||
Or use the `/skills` slash command inside chat.
|
||||
|
||||
### Use Hermes inside an editor via ACP
|
||||
|
||||
Hermes can also run as an ACP server for ACP-compatible editors like VS Code, Zed, and JetBrains:
|
||||
|
||||
```bash
|
||||
pip install -e '.[acp]'
|
||||
hermes acp
|
||||
```
|
||||
|
||||
See [ACP Editor Integration](../user-guide/features/acp.md) for setup details.
|
||||
|
||||
### Try MCP servers
|
||||
|
||||
Connect to external tools via the Model Context Protocol:
|
||||
|
||||
```yaml
|
||||
# Add to ~/.hermes/config.yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: npx
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxx"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `hermes` | Start chatting |
|
||||
| `hermes model` | Choose your LLM provider and model |
|
||||
| `hermes tools` | Configure which tools are enabled per platform |
|
||||
| `hermes setup` | Full setup wizard (configures everything at once) |
|
||||
| `hermes doctor` | Diagnose issues |
|
||||
| `hermes update` | Update to latest version |
|
||||
| `hermes gateway` | Start the messaging gateway |
|
||||
| `hermes --continue` | Resume last session |
|
||||
|
||||
## Next Steps
|
||||
|
||||
- **[CLI Guide](../user-guide/cli.md)** — Master the terminal interface
|
||||
- **[Configuration](../user-guide/configuration.md)** — Customize your setup
|
||||
- **[Messaging Gateway](../user-guide/messaging/index.md)** — Connect Telegram, Discord, Slack, WhatsApp, Signal, Email, or Home Assistant
|
||||
- **[Tools & Toolsets](../user-guide/features/tools.md)** — Explore available capabilities
|
||||
79
hermes_code/website/docs/getting-started/updating.md
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Updating & Uninstalling"
|
||||
description: "How to update Hermes Agent to the latest version or uninstall it"
|
||||
---
|
||||
|
||||
# Updating & Uninstalling
|
||||
|
||||
## Updating
|
||||
|
||||
Update to the latest version with a single command:
|
||||
|
||||
```bash
|
||||
hermes update
|
||||
```
|
||||
|
||||
This pulls the latest code, updates dependencies, and prompts you to configure any new options that were added since your last update.
|
||||
|
||||
:::tip
|
||||
`hermes update` automatically detects new configuration options and prompts you to add them. If you skipped that prompt, you can manually run `hermes config check` to see missing options, then `hermes config migrate` to interactively add them.
|
||||
:::
|
||||
|
||||
### Updating from Messaging Platforms
|
||||
|
||||
You can also update directly from Telegram, Discord, Slack, or WhatsApp by sending:
|
||||
|
||||
```
|
||||
/update
|
||||
```
|
||||
|
||||
This pulls the latest code, updates dependencies, and restarts the gateway.
|
||||
|
||||
### Manual Update
|
||||
|
||||
If you installed manually (not via the quick installer):
|
||||
|
||||
```bash
|
||||
cd /path/to/hermes-agent
|
||||
export VIRTUAL_ENV="$(pwd)/venv"
|
||||
|
||||
# Pull latest code and submodules
|
||||
git pull origin main
|
||||
git submodule update --init --recursive
|
||||
|
||||
# Reinstall (picks up new dependencies)
|
||||
uv pip install -e ".[all]"
|
||||
uv pip install -e "./tinker-atropos"
|
||||
|
||||
# Check for new config options
|
||||
hermes config check
|
||||
hermes config migrate # Interactively add any missing options
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Uninstalling
|
||||
|
||||
```bash
|
||||
hermes uninstall
|
||||
```
|
||||
|
||||
The uninstaller gives you the option to keep your configuration files (`~/.hermes/`) for a future reinstall.
|
||||
|
||||
### Manual Uninstall
|
||||
|
||||
```bash
|
||||
rm -f ~/.local/bin/hermes
|
||||
rm -rf /path/to/hermes-agent
|
||||
rm -rf ~/.hermes # Optional — keep if you plan to reinstall
|
||||
```
|
||||
|
||||
:::info
|
||||
If you installed the gateway as a system service, stop and disable it first:
|
||||
```bash
|
||||
hermes gateway stop
|
||||
# Linux: systemctl --user disable hermes-gateway
|
||||
# macOS: launchctl remove ai.hermes.gateway
|
||||
```
|
||||
:::
|
||||
6
hermes_code/website/docs/guides/_category_.json
Normal file
|
|
@ -0,0 +1,6 @@
|
|||
{
|
||||
"label": "Guides & Tutorials",
|
||||
"position": 2,
|
||||
"collapsible": true,
|
||||
"collapsed": false
|
||||
}
|
||||
441
hermes_code/website/docs/guides/build-a-hermes-plugin.md
Normal file
|
|
@ -0,0 +1,441 @@
|
|||
---
|
||||
sidebar_position: 10
|
||||
---
|
||||
|
||||
# Build a Hermes Plugin
|
||||
|
||||
This guide walks through building a complete Hermes plugin from scratch. By the end you'll have a working plugin with multiple tools, lifecycle hooks, shipped data files, and a bundled skill — everything the plugin system supports.
|
||||
|
||||
## What you're building
|
||||
|
||||
A **calculator** plugin with two tools:
|
||||
- `calculate` — evaluate math expressions (`2**16`, `sqrt(144)`, `pi * 5**2`)
|
||||
- `unit_convert` — convert between units (`100 F → 37.78 C`, `5 km → 3.11 mi`)
|
||||
|
||||
Plus a hook that logs every tool call, and a bundled skill file.
|
||||
|
||||
## Step 1: Create the plugin directory
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.hermes/plugins/calculator
|
||||
cd ~/.hermes/plugins/calculator
|
||||
```
|
||||
|
||||
## Step 2: Write the manifest
|
||||
|
||||
Create `plugin.yaml`:
|
||||
|
||||
```yaml
|
||||
name: calculator
|
||||
version: 1.0.0
|
||||
description: Math calculator — evaluate expressions and convert units
|
||||
provides_tools:
|
||||
- calculate
|
||||
- unit_convert
|
||||
provides_hooks:
|
||||
- post_tool_call
|
||||
```
|
||||
|
||||
This tells Hermes: "I'm a plugin called calculator, I provide tools and hooks." The `provides_tools` and `provides_hooks` fields are lists of what the plugin registers.
|
||||
|
||||
Optional fields you could add:
|
||||
```yaml
|
||||
author: Your Name
|
||||
requires_env: # gate loading on env vars
|
||||
- SOME_API_KEY # plugin disabled if missing
|
||||
```
|
||||
|
||||
## Step 3: Write the tool schemas
|
||||
|
||||
Create `schemas.py` — this is what the LLM reads to decide when to call your tools:
|
||||
|
||||
```python
|
||||
"""Tool schemas — what the LLM sees."""
|
||||
|
||||
CALCULATE = {
|
||||
"name": "calculate",
|
||||
"description": (
|
||||
"Evaluate a mathematical expression and return the result. "
|
||||
"Supports arithmetic (+, -, *, /, **), functions (sqrt, sin, cos, "
|
||||
"log, abs, round, floor, ceil), and constants (pi, e). "
|
||||
"Use this for any math the user asks about."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"expression": {
|
||||
"type": "string",
|
||||
"description": "Math expression to evaluate (e.g., '2**10', 'sqrt(144)')",
|
||||
},
|
||||
},
|
||||
"required": ["expression"],
|
||||
},
|
||||
}
|
||||
|
||||
UNIT_CONVERT = {
|
||||
"name": "unit_convert",
|
||||
"description": (
|
||||
"Convert a value between units. Supports length (m, km, mi, ft, in), "
|
||||
"weight (kg, lb, oz, g), temperature (C, F, K), data (B, KB, MB, GB, TB), "
|
||||
"and time (s, min, hr, day)."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"value": {
|
||||
"type": "number",
|
||||
"description": "The numeric value to convert",
|
||||
},
|
||||
"from_unit": {
|
||||
"type": "string",
|
||||
"description": "Source unit (e.g., 'km', 'lb', 'F', 'GB')",
|
||||
},
|
||||
"to_unit": {
|
||||
"type": "string",
|
||||
"description": "Target unit (e.g., 'mi', 'kg', 'C', 'MB')",
|
||||
},
|
||||
},
|
||||
"required": ["value", "from_unit", "to_unit"],
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
**Why schemas matter:** The `description` field is how the LLM decides when to use your tool. Be specific about what it does and when to use it. The `parameters` define what arguments the LLM passes.
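
Because a typo in a schema is easy to miss, it can be worth linting the dicts before loading the plugin. The following is a minimal sketch: `check_schema` is a hypothetical helper, and the checks reflect assumptions about what makes a schema usable rather than rules that Hermes is known to enforce.

```python
def check_schema(schema: dict) -> list[str]:
    """Return a list of problems found in a tool schema dict (empty if none)."""
    problems = []
    for field in ("name", "description", "parameters"):
        if not schema.get(field):
            problems.append(f"missing or empty '{field}'")
    params = schema.get("parameters") or {}
    properties = params.get("properties", {})
    # Every name listed under "required" should be declared as a property.
    for required_name in params.get("required", []):
        if required_name not in properties:
            problems.append(
                f"required parameter '{required_name}' has no property entry"
            )
    # The LLM relies on descriptions, so flag parameters that lack one.
    for prop_name, prop in properties.items():
        if not prop.get("description"):
            problems.append(f"parameter '{prop_name}' has no description")
    return problems


# Example:
# from schemas import CALCULATE, UNIT_CONVERT
# for schema in (CALCULATE, UNIT_CONVERT):
#     print(schema["name"], check_schema(schema))
```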
|
||||
|
||||
## Step 4: Write the tool handlers
|
||||
|
||||
Create `tools.py` — this is the code that actually executes when the LLM calls your tools:
|
||||
|
||||
```python
|
||||
"""Tool handlers — the code that runs when the LLM calls each tool."""
|
||||
|
||||
import json
|
||||
import math
|
||||
|
||||
# Allow-listed names for expression evaluation (restricts eval; not a full sandbox)
|
||||
_SAFE_MATH = {
|
||||
"abs": abs, "round": round, "min": min, "max": max,
|
||||
"pow": pow, "sqrt": math.sqrt, "sin": math.sin, "cos": math.cos,
|
||||
"tan": math.tan, "log": math.log, "log2": math.log2, "log10": math.log10,
|
||||
"floor": math.floor, "ceil": math.ceil,
|
||||
"pi": math.pi, "e": math.e,
|
||||
"factorial": math.factorial,
|
||||
}
|
||||
|
||||
|
||||
def calculate(args: dict, **kwargs) -> str:
|
||||
"""Evaluate a math expression safely.
|
||||
|
||||
Rules for handlers:
|
||||
1. Receive args (dict) — the parameters the LLM passed
|
||||
2. Do the work
|
||||
3. Return a JSON string — ALWAYS, even on error
|
||||
4. Accept **kwargs for forward compatibility
|
||||
"""
|
||||
expression = args.get("expression", "").strip()
|
||||
if not expression:
|
||||
return json.dumps({"error": "No expression provided"})
|
||||
|
||||
try:
|
||||
result = eval(expression, {"__builtins__": {}}, _SAFE_MATH)
|
||||
return json.dumps({"expression": expression, "result": result})
|
||||
except ZeroDivisionError:
|
||||
return json.dumps({"expression": expression, "error": "Division by zero"})
|
||||
except Exception as e:
|
||||
return json.dumps({"expression": expression, "error": f"Invalid: {e}"})
|
||||
|
||||
|
||||
# Conversion tables — values are in base units
|
||||
_LENGTH = {"m": 1, "km": 1000, "mi": 1609.34, "ft": 0.3048, "in": 0.0254, "cm": 0.01}
|
||||
_WEIGHT = {"kg": 1, "g": 0.001, "lb": 0.453592, "oz": 0.0283495}
|
||||
_DATA = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
|
||||
_TIME = {"s": 1, "ms": 0.001, "min": 60, "hr": 3600, "day": 86400}
|
||||
|
||||
|
||||
def _convert_temp(value, from_u, to_u):
|
||||
# Normalize to Celsius
|
||||
c = {"F": (value - 32) * 5/9, "K": value - 273.15}.get(from_u, value)
|
||||
# Convert to target
|
||||
return {"F": c * 9/5 + 32, "K": c + 273.15}.get(to_u, c)
|
||||
|
||||
|
||||
def unit_convert(args: dict, **kwargs) -> str:
|
||||
"""Convert between units."""
|
||||
value = args.get("value")
|
||||
from_unit = args.get("from_unit", "").strip()
|
||||
to_unit = args.get("to_unit", "").strip()
|
||||
|
||||
if value is None or not from_unit or not to_unit:
|
||||
return json.dumps({"error": "Need value, from_unit, and to_unit"})
|
||||
|
||||
try:
|
||||
# Temperature
|
||||
if from_unit.upper() in {"C","F","K"} and to_unit.upper() in {"C","F","K"}:
|
||||
result = _convert_temp(float(value), from_unit.upper(), to_unit.upper())
|
||||
return json.dumps({"input": f"{value} {from_unit}", "result": round(result, 4),
|
||||
"output": f"{round(result, 4)} {to_unit}"})
|
||||
|
||||
# Ratio-based conversions
|
||||
for table in (_LENGTH, _WEIGHT, _DATA, _TIME):
|
||||
lc = {k.lower(): v for k, v in table.items()}
|
||||
if from_unit.lower() in lc and to_unit.lower() in lc:
|
||||
result = float(value) * lc[from_unit.lower()] / lc[to_unit.lower()]
|
||||
return json.dumps({"input": f"{value} {from_unit}",
|
||||
"result": round(result, 6),
|
||||
"output": f"{round(result, 6)} {to_unit}"})
|
||||
|
||||
return json.dumps({"error": f"Cannot convert {from_unit} → {to_unit}"})
|
||||
except Exception as e:
|
||||
return json.dumps({"error": f"Conversion failed: {e}"})
|
||||
```
|
||||
|
||||
**Key rules for handlers:**
|
||||
1. **Signature:** `def my_handler(args: dict, **kwargs) -> str`
|
||||
2. **Return:** Always a JSON string. Success and errors alike.
|
||||
3. **Never raise:** Catch all exceptions, return error JSON instead.
|
||||
4. **Accept `**kwargs`:** Hermes may pass additional context in the future.
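
The four rules can also be enforced mechanically. The following is a minimal sketch, assuming a hypothetical `json_handler` decorator (not part of Hermes) that wraps a function returning a plain dict so the handler always returns a JSON string and never raises:

```python
import functools
import json


def json_handler(func):
    """Hypothetical helper, not a Hermes API: wrap a dict-returning function
    so the resulting handler always returns a JSON string and never raises."""
    @functools.wraps(func)
    def wrapper(args: dict, **kwargs) -> str:
        try:
            return json.dumps(func(args, **kwargs))
        except Exception as exc:  # rule 3: no exception escapes the handler
            return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return wrapper


@json_handler
def reciprocal(args: dict, **kwargs) -> dict:
    # Illustrative handler body; may raise KeyError, ValueError, ZeroDivisionError
    return {"result": 1 / float(args["value"])}


print(reciprocal({"value": 4}))
print(reciprocal({"value": 0}))
```

Whether a decorator like this is worth it depends on how many handlers the plugin has; for one or two tools, an explicit `try`/`except` as in `tools.py` is just as clear.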
|
||||
|
||||
## Step 5: Write the registration
|
||||
|
||||
Create `__init__.py` — this wires schemas to handlers:
|
||||
|
||||
```python
|
||||
"""Calculator plugin: registration."""

import logging

from . import schemas, tools

logger = logging.getLogger(__name__)

# Track tool usage via hooks
_call_log = []


def _on_post_tool_call(tool_name, args, result, task_id, **kwargs):
    """Hook: runs after every tool call (not just ours)."""
    _call_log.append({"tool": tool_name, "session": task_id})
    if len(_call_log) > 100:
        _call_log.pop(0)
    logger.debug("Tool called: %s (session %s)", tool_name, task_id)


def register(ctx):
    """Wire schemas to handlers and register hooks."""
    ctx.register_tool(name="calculate", toolset="calculator",
                      schema=schemas.CALCULATE, handler=tools.calculate)
    ctx.register_tool(name="unit_convert", toolset="calculator",
                      schema=schemas.UNIT_CONVERT, handler=tools.unit_convert)

    # This hook fires for ALL tool calls, not just ours
    ctx.register_hook("post_tool_call", _on_post_tool_call)
|
||||
```
|
||||
|
||||
**What `register()` does:**
|
||||
- Called exactly once at startup
|
||||
- `ctx.register_tool()` puts your tool in the registry — the model sees it immediately
|
||||
- `ctx.register_hook()` subscribes to lifecycle events
|
||||
- `ctx.register_command()` — _planned but not yet implemented_
|
||||
- If this function crashes, the plugin is disabled but Hermes continues fine
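
Because `register()` only calls methods on `ctx`, it can be exercised without starting Hermes. The sketch below assumes a hypothetical `FakeContext` test double that mirrors only the two methods used in this guide; the real context object may accept more arguments or behave differently.

```python
class FakeContext:
    """Hypothetical test double; records what a plugin registers."""

    def __init__(self):
        self.tools = {}
        self.hooks = []

    def register_tool(self, name, schema, handler, toolset=None, **kwargs):
        self.tools[name] = {"toolset": toolset, "schema": schema, "handler": handler}

    def register_hook(self, event, callback):
        self.hooks.append((event, callback))


def register(ctx):
    """Stand-in for a plugin's register(); the real one lives in __init__.py."""
    ctx.register_tool(name="echo", toolset="demo", schema={"name": "echo"},
                      handler=lambda args, **kw: "{}")
    ctx.register_hook("post_tool_call", lambda **kw: None)


ctx = FakeContext()
register(ctx)
print(sorted(ctx.tools), [event for event, _ in ctx.hooks])
```

In a real test you would import `register` from the plugin package instead of defining the stand-in shown here.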
|
||||
|
||||
## Step 6: Test it
|
||||
|
||||
Start Hermes:
|
||||
|
||||
```bash
|
||||
hermes
|
||||
```
|
||||
|
||||
You should see `calculator: calculate, unit_convert` in the banner's tool list.
|
||||
|
||||
Try these prompts:
|
||||
```
|
||||
What's 2 to the power of 16?
|
||||
Convert 100 fahrenheit to celsius
|
||||
What's the square root of 2 times pi?
|
||||
How many gigabytes is 1.5 terabytes?
|
||||
```
|
||||
|
||||
Check plugin status:
|
||||
```
|
||||
/plugins
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
Plugins (1):
|
||||
✓ calculator v1.0.0 (2 tools, 1 hooks)
|
||||
```
|
||||
|
||||
## Your plugin's final structure
|
||||
|
||||
```
|
||||
~/.hermes/plugins/calculator/
|
||||
├── plugin.yaml # "I'm calculator, I provide tools and hooks"
|
||||
├── __init__.py # Wiring: schemas → handlers, register hooks
|
||||
├── schemas.py # What the LLM reads (descriptions + parameter specs)
|
||||
└── tools.py # What runs (calculate, unit_convert functions)
|
||||
```
|
||||
|
||||
Four files, clear separation:
|
||||
- **Manifest** declares what the plugin is
|
||||
- **Schemas** describe tools for the LLM
|
||||
- **Handlers** implement the actual logic
|
||||
- **Registration** connects everything
|
||||
|
||||
## What else can plugins do?
|
||||
|
||||
### Ship data files
|
||||
|
||||
Put any files in your plugin directory and read them at import time:
|
||||
|
||||
```python
|
||||
# In tools.py or __init__.py
from pathlib import Path

import yaml  # PyYAML: third-party, must be installed alongside the plugin

_PLUGIN_DIR = Path(__file__).parent
_DATA_FILE = _PLUGIN_DIR / "data" / "languages.yaml"

with open(_DATA_FILE) as f:
    _DATA = yaml.safe_load(f)
|
||||
```
|
||||
|
||||
### Bundle a skill
|
||||
|
||||
Include a `skill.md` file and install it during registration:
|
||||
|
||||
```python
|
||||
import shutil
from pathlib import Path


def _install_skill():
    """Copy our skill to ~/.hermes/skills/ on first load."""
    try:
        from hermes_cli.config import get_hermes_home
        dest = get_hermes_home() / "skills" / "my-plugin" / "SKILL.md"
    except Exception:
        dest = Path.home() / ".hermes" / "skills" / "my-plugin" / "SKILL.md"

    if dest.exists():
        return  # don't overwrite user edits

    source = Path(__file__).parent / "skill.md"
    if source.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, dest)


def register(ctx):
    ctx.register_tool(...)
    _install_skill()
|
||||
```
|
||||
|
||||
### Gate on environment variables
|
||||
|
||||
If your plugin needs an API key:
|
||||
|
||||
```yaml
|
||||
# plugin.yaml
|
||||
requires_env:
|
||||
- WEATHER_API_KEY
|
||||
```
|
||||
|
||||
If `WEATHER_API_KEY` isn't set, the plugin is disabled with a clear message. No crash, no error in the agent — just "Plugin weather disabled (missing: WEATHER_API_KEY)".
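
Hermes performs this check itself from `plugin.yaml`. Purely to illustrate the behaviour described above, here is a sketch of an equivalent check; `missing_env` is a hypothetical helper, not a Hermes function:

```python
import os


def missing_env(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Simulate an environment without the key
missing = missing_env(["WEATHER_API_KEY"], environ={})
if missing:
    print(f"Plugin weather disabled (missing: {', '.join(missing)})")
```

A helper like this can be handy inside a handler when a key is optional rather than required, so the tool can return an error JSON instead of disabling the whole plugin.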
|
||||
|
||||
### Conditional tool availability
|
||||
|
||||
For tools that depend on optional libraries:
|
||||
|
||||
```python
|
||||
ctx.register_tool(
|
||||
name="my_tool",
|
||||
schema={...},
|
||||
handler=my_handler,
|
||||
check_fn=lambda: _has_optional_lib(), # False = tool hidden from model
|
||||
)
|
||||
```
|
||||
|
||||
### Register multiple hooks
|
||||
|
||||
```python
|
||||
def register(ctx):
    ctx.register_hook("pre_tool_call", before_any_tool)
    ctx.register_hook("post_tool_call", after_any_tool)
    ctx.register_hook("on_session_start", on_new_session)
    ctx.register_hook("on_session_end", on_session_end)
|
||||
```
|
||||
|
||||
Available hooks:
|
||||
|
||||
| Hook | When | Arguments |
|------|------|-----------|
| `pre_tool_call` | Before any tool runs | `tool_name`, `args`, `task_id` |
| `post_tool_call` | After any tool returns | `tool_name`, `args`, `result`, `task_id` |
| `pre_llm_call` | Before LLM API call | `messages`, `model` |
| `post_llm_call` | After LLM response | `messages`, `response`, `model` |
| `on_session_start` | Session begins | `session_id`, `platform` |
| `on_session_end` | Session ends | `session_id`, `platform` |
|
||||
|
||||
Hooks are observers — they can't modify arguments or return values. If a hook crashes, it's logged and skipped; other hooks and the tool continue normally.
|
||||
|
||||
### Distribute via pip
|
||||
|
||||
For sharing plugins publicly, add an entry point to your Python package:
|
||||
|
||||
```toml
|
||||
# pyproject.toml
|
||||
[project.entry-points."hermes_agent.plugins"]
|
||||
my-plugin = "my_plugin_package"
|
||||
```
|
||||
|
||||
```bash
|
||||
pip install hermes-plugin-calculator
|
||||
# Plugin auto-discovered on next hermes startup
|
||||
```
|
||||
|
||||
## Common mistakes
|
||||
|
||||
**Handler doesn't return JSON string:**
|
||||
```python
|
||||
# Wrong: returns a dict
def handler(args, **kwargs):
    return {"result": 42}

# Right: returns a JSON string
def handler(args, **kwargs):
    return json.dumps({"result": 42})
|
||||
```
|
||||
|
||||
**Missing `**kwargs` in handler signature:**
|
||||
```python
|
||||
# Wrong: will break if Hermes passes extra context
def handler(args):
    ...

# Right
def handler(args, **kwargs):
    ...
|
||||
```
|
||||
|
||||
**Handler raises exceptions:**
|
||||
```python
|
||||
# Wrong: exception propagates, tool call fails
def handler(args, **kwargs):
    result = 1 / int(args["value"])  # ZeroDivisionError!
    return json.dumps({"result": result})

# Right: catch and return error JSON
def handler(args, **kwargs):
    try:
        result = 1 / int(args.get("value", 0))
        return json.dumps({"result": result})
    except Exception as e:
        return json.dumps({"error": str(e)})
|
||||
```
|
||||
|
||||
**Schema description too vague:**
|
||||
```python
|
||||
# Bad — model doesn't know when to use it
|
||||
"description": "Does stuff"
|
||||
|
||||
# Good — model knows exactly when and how
|
||||
"description": "Evaluate a mathematical expression. Use for arithmetic, trig, logarithms. Supports: +, -, *, /, **, sqrt, sin, cos, log, pi, e."
|
||||
```
|
||||
266
hermes_code/website/docs/guides/daily-briefing-bot.md
Normal file
|
|
@ -0,0 +1,266 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Tutorial: Daily Briefing Bot"
|
||||
description: "Build an automated daily briefing bot that researches topics, summarizes findings, and delivers them to Telegram or Discord every morning"
|
||||
---
|
||||
|
||||
# Tutorial: Build a Daily Briefing Bot
|
||||
|
||||
In this tutorial, you'll build a personal briefing bot that wakes up every morning, researches topics you care about, summarizes the findings, and delivers a concise briefing straight to your Telegram or Discord.
|
||||
|
||||
By the end, you'll have a fully automated workflow combining **web search**, **cron scheduling**, **delegation**, and **messaging delivery** — no code required.
|
||||
|
||||
## What We're Building
|
||||
|
||||
Here's the flow:
|
||||
|
||||
1. **8:00 AM** — The cron scheduler triggers your job
|
||||
2. **Hermes spins up** a fresh agent session with your prompt
|
||||
3. **Web search** pulls the latest news on your topics
|
||||
4. **Summarization** distills it into a clean briefing format
|
||||
5. **Delivery** sends the briefing to your Telegram or Discord
|
||||
|
||||
The whole thing runs hands-free. You just read your briefing with your morning coffee.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting, make sure you have:
|
||||
|
||||
- **Hermes Agent installed** — see the [Installation guide](/docs/getting-started/installation)
|
||||
- **Gateway running** — the gateway daemon handles cron execution:
|
||||
```bash
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux servers: boot-time system service
|
||||
# or
|
||||
hermes gateway # Run in foreground
|
||||
```
|
||||
- **Firecrawl API key** — set `FIRECRAWL_API_KEY` in your environment for web search
|
||||
- **Messaging configured** (optional but recommended) — [Telegram](/docs/user-guide/messaging/telegram) or Discord set up with a home channel
|
||||
|
||||
:::tip No messaging? No problem
|
||||
You can still follow this tutorial using `deliver: "local"`. Briefings will be saved to `~/.hermes/cron/output/` and you can read them anytime.
|
||||
:::
|
||||
|
||||
## Step 1: Test the Workflow Manually
|
||||
|
||||
Before automating anything, let's make sure the briefing works. Start a chat session:
|
||||
|
||||
```bash
|
||||
hermes
|
||||
```
|
||||
|
||||
Then enter this prompt:
|
||||
|
||||
```
|
||||
Search for the latest news about AI agents and open source LLMs.
|
||||
Summarize the top 3 stories in a concise briefing format with links.
|
||||
```
|
||||
|
||||
Hermes will search the web, read through results, and produce something like:
|
||||
|
||||
```
|
||||
☀️ Your AI Briefing — March 8, 2026
|
||||
|
||||
1. Qwen 3 Released with 235B Parameters
|
||||
Alibaba's latest open-weight model matches GPT-4.5 on several
|
||||
benchmarks while remaining fully open source.
|
||||
→ https://qwenlm.github.io/blog/qwen3/
|
||||
|
||||
2. LangChain Launches Agent Protocol Standard
|
||||
A new open standard for agent-to-agent communication gains
|
||||
adoption from 15 major frameworks in its first week.
|
||||
→ https://blog.langchain.dev/agent-protocol/
|
||||
|
||||
3. EU AI Act Enforcement Begins for General-Purpose Models
|
||||
The first compliance deadlines hit, with open source models
|
||||
receiving exemptions under the 10M parameter threshold.
|
||||
→ https://artificialintelligenceact.eu/updates/
|
||||
|
||||
---
|
||||
3 stories • Sources searched: 8 • Generated by Hermes Agent
|
||||
```
|
||||
|
||||
If this works, you're ready to automate it.
|
||||
|
||||
:::tip Iterate on the format
|
||||
Try different prompts until you get output you love. Add instructions like "use emoji headers" or "keep each summary under 2 sentences." Whatever you settle on goes into the cron job.
|
||||
:::
|
||||
|
||||
## Step 2: Create the Cron Job
|
||||
|
||||
Now let's schedule this to run automatically every morning. You can do this in two ways.
|
||||
|
||||
### Option A: Natural Language (in chat)
|
||||
|
||||
Just tell Hermes what you want:
|
||||
|
||||
```
|
||||
Every morning at 8am, search the web for the latest news about AI agents
|
||||
and open source LLMs. Summarize the top 3 stories in a concise briefing
|
||||
with links. Use a friendly, professional tone. Deliver to telegram.
|
||||
```
|
||||
|
||||
Hermes will create the cron job for you using the unified `cronjob` tool.
|
||||
|
||||
### Option B: CLI Slash Command
|
||||
|
||||
Use the `/cron` command for more control:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * *" "Search the web for the latest news about AI agents and open source LLMs. Find at least 5 recent articles from the past 24 hours. Summarize the top 3 most important stories in a concise daily briefing format. For each story include: a clear headline, a 2-sentence summary, and the source URL. Use a friendly, professional tone. Format with emoji bullet points and end with a total story count."
|
||||
```
|
||||
|
||||
### The Golden Rule: Self-Contained Prompts
|
||||
|
||||
:::warning Critical concept
|
||||
Cron jobs run in a **completely fresh session** — no memory of your previous conversations, no context about what you "set up earlier." Your prompt must contain **everything** the agent needs to do the job.
|
||||
:::
|
||||
|
||||
**Bad prompt:**
|
||||
```
|
||||
Do my usual morning briefing.
|
||||
```
|
||||
|
||||
**Good prompt:**
|
||||
```
|
||||
Search the web for the latest news about AI agents and open source LLMs.
|
||||
Find at least 5 recent articles from the past 24 hours. Summarize the
|
||||
top 3 most important stories in a concise daily briefing format. For each
|
||||
story include: a clear headline, a 2-sentence summary, and the source URL.
|
||||
Use a friendly, professional tone. Format with emoji bullet points.
|
||||
```
|
||||
|
||||
The good prompt is specific about **what to search**, **how many articles**, **what format**, and **what tone**. It's everything the agent needs in one shot.
|
||||
|
||||
## Step 3: Customize the Briefing
|
||||
|
||||
Once the basic briefing works, you can get creative.
|
||||
|
||||
### Multi-Topic Briefings
|
||||
|
||||
Cover several areas in one briefing:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * *" "Create a morning briefing covering three topics. For each topic, search the web for recent news from the past 24 hours and summarize the top 2 stories with links.
|
||||
|
||||
Topics:
|
||||
1. AI and machine learning — focus on open source models and agent frameworks
|
||||
2. Cryptocurrency — focus on Bitcoin, Ethereum, and regulatory news
|
||||
3. Space exploration — focus on SpaceX, NASA, and commercial space
|
||||
|
||||
Format as a clean briefing with section headers and emoji. End with today's date and a motivational quote."
|
||||
```
|
||||
|
||||
### Using Delegation for Parallel Research
|
||||
|
||||
For faster briefings, tell Hermes to delegate each topic to a sub-agent:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * *" "Create a morning briefing by delegating research to sub-agents. Delegate three parallel tasks:
|
||||
|
||||
1. Delegate: Search for the top 2 AI/ML news stories from the past 24 hours with links
|
||||
2. Delegate: Search for the top 2 cryptocurrency news stories from the past 24 hours with links
|
||||
3. Delegate: Search for the top 2 space exploration news stories from the past 24 hours with links
|
||||
|
||||
Collect all results and combine them into a single clean briefing with section headers, emoji formatting, and source links. Add today's date as a header."
|
||||
```
|
||||
|
||||
Each sub-agent searches independently and in parallel, then the main agent combines everything into one polished briefing. See the [Delegation docs](/docs/user-guide/features/delegation) for more on how this works.
|
||||
|
||||
### Weekday-Only Schedule
|
||||
|
||||
Don't need briefings on weekends? Use a cron expression that targets Monday–Friday:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * 1-5" "Search for the latest AI and tech news..."
|
||||
```
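
If the five-field syntax is unfamiliar, it can help to name the fields. The sketch below is a small standalone helper (`describe_cron` is a hypothetical name, not a Hermes function) that assumes the standard five-field cron layout and only splits the expression; it does not validate ranges.

```python
FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expr: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(FIELDS, parts))


print(describe_cron("0 8 * * 1-5"))
# {'minute': '0', 'hour': '8', 'day_of_month': '*', 'month': '*', 'day_of_week': '1-5'}
```

Here `1-5` in the last field is what restricts the job to Monday through Friday.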
|
||||
|
||||
### Twice-Daily Briefings
|
||||
|
||||
Get a morning overview and an evening recap:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * *" "Morning briefing: search for AI news from the past 12 hours..."
|
||||
/cron add "0 18 * * *" "Evening recap: search for AI news from the past 12 hours..."
|
||||
```
|
||||
|
||||
### Adding Personal Context with Memory
|
||||
|
||||
If you have [memory](/docs/user-guide/features/memory) enabled, you can store preferences that persist across sessions. But remember — cron jobs run in fresh sessions without conversational memory. To add personal context, bake it directly into the prompt:
|
||||
|
||||
```
|
||||
/cron add "0 8 * * *" "You are creating a briefing for a senior ML engineer who cares about: PyTorch ecosystem, transformer architectures, open-weight models, and AI regulation in the EU. Skip stories about product launches or funding rounds unless they involve open source.
|
||||
|
||||
Search for the latest news on these topics. Summarize the top 3 stories with links. Be concise and technical — this reader doesn't need basic explanations."
|
||||
```
|
||||
|
||||
:::tip Tailor the persona
|
||||
Including details about who the briefing is *for* dramatically improves relevance. Tell the agent your role, interests, and what to skip.
|
||||
:::
|
||||
|
||||
## Step 4: Manage Your Jobs
|
||||
|
||||
### List All Scheduled Jobs
|
||||
|
||||
In chat:
|
||||
```
|
||||
/cron list
|
||||
```
|
||||
|
||||
Or from the terminal:
|
||||
```bash
|
||||
hermes cron list
|
||||
```
|
||||
|
||||
You'll see output like:
|
||||
|
||||
```
|
||||
ID          | Name              | Schedule    | Next Run           | Deliver
------------|-------------------|-------------|--------------------|---------
a1b2c3d4    | Morning Briefing  | 0 8 * * *   | 2026-03-09 08:00   | telegram
e5f6g7h8    | Evening Recap     | 0 18 * * *  | 2026-03-08 18:00   | telegram
|
||||
```
|
||||
|
||||
### Remove a Job
|
||||
|
||||
In chat:
|
||||
```
|
||||
/cron remove a1b2c3d4
|
||||
```
|
||||
|
||||
Or ask conversationally:
|
||||
```
|
||||
Remove my morning briefing cron job.
|
||||
```
|
||||
|
||||
Hermes will use `cronjob(action="list")` to find it and `cronjob(action="remove")` to delete it.
|
||||
|
||||
### Check Gateway Status
|
||||
|
||||
Make sure the scheduler is actually running:
|
||||
|
||||
```bash
|
||||
hermes cron status
|
||||
```
|
||||
|
||||
If the gateway isn't running, your jobs won't execute. Install it as a background service for reliability:
|
||||
|
||||
```bash
|
||||
hermes gateway install
|
||||
# or on Linux servers
|
||||
sudo hermes gateway install --system
|
||||
```
|
||||
|
||||
## Going Further
|
||||
|
||||
You've built a working daily briefing bot. Here are some directions to explore next:
|
||||
|
||||
- **[Scheduled Tasks (Cron)](/docs/user-guide/features/cron)** — Full reference for schedule formats, repeat limits, and delivery options
|
||||
- **[Delegation](/docs/user-guide/features/delegation)** — Deep dive into parallel sub-agent workflows
|
||||
- **[Messaging Platforms](/docs/user-guide/messaging)** — Set up Telegram, Discord, or other delivery targets
|
||||
- **[Memory](/docs/user-guide/features/memory)** — Persistent context across sessions
|
||||
- **[Tips & Best Practices](/docs/guides/tips)** — More prompt engineering advice
|
||||
|
||||
:::tip What else can you schedule?
|
||||
The briefing bot pattern works for anything: competitor monitoring, GitHub repo summaries, weather forecasts, portfolio tracking, server health checks, or even a daily joke. If you can describe it in a prompt, you can schedule it.
|
||||
:::
|
||||
340
hermes_code/website/docs/guides/python-library.md
Normal file
|
|
@ -0,0 +1,340 @@
|
|||
---
|
||||
sidebar_position: 4
|
||||
title: "Using Hermes as a Python Library"
|
||||
description: "Embed AIAgent in your own Python scripts, web apps, or automation pipelines — no CLI required"
|
||||
---
|
||||
|
||||
# Using Hermes as a Python Library
|
||||
|
||||
Hermes isn't just a CLI tool. You can import `AIAgent` directly and use it programmatically in your own Python scripts, web applications, or automation pipelines. This guide shows you how.
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
|
||||
Install Hermes directly from the repository:
|
||||
|
||||
```bash
|
||||
pip install git+https://github.com/NousResearch/hermes-agent.git
|
||||
```
|
||||
|
||||
Or with [uv](https://docs.astral.sh/uv/):
|
||||
|
||||
```bash
|
||||
uv pip install git+https://github.com/NousResearch/hermes-agent.git
|
||||
```
|
||||
|
||||
You can also pin it in your `requirements.txt`:
|
||||
|
||||
```text
|
||||
hermes-agent @ git+https://github.com/NousResearch/hermes-agent.git
|
||||
```
|
||||
|
||||
:::tip
|
||||
The same environment variables used by the CLI are required when using Hermes as a library. At minimum, set `OPENROUTER_API_KEY` (or `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` if using direct provider access).
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Basic Usage
|
||||
|
||||
The simplest way to use Hermes is the `chat()` method — pass a message, get a string back:
|
||||
|
||||
```python
|
||||
from run_agent import AIAgent
|
||||
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
quiet_mode=True,
|
||||
)
|
||||
response = agent.chat("What is the capital of France?")
|
||||
print(response)
|
||||
```
|
||||
|
||||
`chat()` handles the full conversation loop internally — tool calls, retries, everything — and returns just the final text response.
|
||||
|
||||
:::warning
|
||||
Always set `quiet_mode=True` when embedding Hermes in your own code. Without it, the agent prints CLI spinners, progress indicators, and other terminal output that will clutter your application's output.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Full Conversation Control
|
||||
|
||||
For more control over the conversation, use `run_conversation()` directly. It returns a dictionary with the full response, message history, and metadata:
|
||||
|
||||
```python
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
quiet_mode=True,
|
||||
)
|
||||
|
||||
result = agent.run_conversation(
|
||||
user_message="Search for recent Python 3.13 features",
|
||||
task_id="my-task-1",
|
||||
)
|
||||
|
||||
print(result["final_response"])
|
||||
print(f"Messages exchanged: {len(result['messages'])}")
|
||||
```
|
||||
|
||||
The returned dictionary contains:
|
||||
- **`final_response`** — The agent's final text reply
|
||||
- **`messages`** — The complete message history (system, user, assistant, tool calls)
|
||||
- **`task_id`** — The task identifier used for VM isolation
|
||||
|
||||
You can also pass a custom system message that overrides the ephemeral system prompt for that call:
|
||||
|
||||
```python
|
||||
result = agent.run_conversation(
|
||||
user_message="Explain quicksort",
|
||||
system_message="You are a computer science tutor. Use simple analogies.",
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuring Tools
|
||||
|
||||
Control which toolsets the agent has access to using `enabled_toolsets` or `disabled_toolsets`:
|
||||
|
||||
```python
|
||||
# Only enable web tools (browsing, search)
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
enabled_toolsets=["web"],
|
||||
quiet_mode=True,
|
||||
)
|
||||
|
||||
# Enable everything except terminal access
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
disabled_toolsets=["terminal"],
|
||||
quiet_mode=True,
|
||||
)
|
||||
```
|
||||
|
||||
:::tip
|
||||
Use `enabled_toolsets` when you want a minimal, locked-down agent (e.g., only web search for a research bot). Use `disabled_toolsets` when you want most capabilities but need to restrict specific ones (e.g., no terminal access in a shared environment).
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Multi-turn Conversations
|
||||
|
||||
Maintain conversation state across multiple turns by passing the message history back in:
|
||||
|
||||
```python
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
quiet_mode=True,
|
||||
)
|
||||
|
||||
# First turn
|
||||
result1 = agent.run_conversation("My name is Alice")
|
||||
history = result1["messages"]
|
||||
|
||||
# Second turn — agent remembers the context
|
||||
result2 = agent.run_conversation(
|
||||
"What's my name?",
|
||||
conversation_history=history,
|
||||
)
|
||||
print(result2["final_response"]) # "Your name is Alice."
|
||||
```
|
||||
|
||||
The `conversation_history` parameter accepts the `messages` list from a previous result. The agent copies it internally, so your original list is never mutated.
|
||||
|
||||
---
|
||||
|
||||
## Saving Trajectories
|
||||
|
||||
Enable trajectory saving to capture conversations in ShareGPT format — useful for generating training data or debugging:
|
||||
|
||||
```python
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
save_trajectories=True,
|
||||
quiet_mode=True,
|
||||
)
|
||||
|
||||
agent.chat("Write a Python function to sort a list")
|
||||
# Saves to trajectory_samples.jsonl in ShareGPT format
|
||||
```
|
||||
|
||||
Each conversation is appended as a single JSONL line, making it easy to collect datasets from automated runs.
|
||||
|
||||
---
|
||||
|
||||
## Custom System Prompts
|
||||
|
||||
Use `ephemeral_system_prompt` to set a custom system prompt that guides the agent's behavior but is **not** saved to trajectory files (keeping your training data clean):
|
||||
|
||||
```python
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
ephemeral_system_prompt="You are a SQL expert. Only answer database questions.",
|
||||
quiet_mode=True,
|
||||
)
|
||||
|
||||
response = agent.chat("How do I write a JOIN query?")
|
||||
print(response)
|
||||
```
|
||||
|
||||
This is ideal for building specialized agents — a code reviewer, a documentation writer, a SQL assistant — all using the same underlying tooling.
|
||||
|
||||
---
|
||||
|
||||
## Batch Processing
|
||||
|
||||
For running many prompts in parallel, Hermes includes `batch_runner.py`. It manages concurrent `AIAgent` instances with proper resource isolation:
|
||||
|
||||
```bash
|
||||
python batch_runner.py --input prompts.jsonl --output results.jsonl
|
||||
```
|
||||
|
||||
Each prompt gets its own `task_id` and isolated environment. If you need custom batch logic, you can build your own using `AIAgent` directly:
|
||||
|
||||
```python
|
||||
import concurrent.futures
from run_agent import AIAgent

prompts = [
    "Explain recursion",
    "What is a hash table?",
    "How does garbage collection work?",
]


def process_prompt(prompt):
    # Create a fresh agent per task for thread safety
    agent = AIAgent(
        model="anthropic/claude-sonnet-4",
        quiet_mode=True,
        skip_memory=True,
    )
    return agent.chat(prompt)


with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(process_prompt, prompts))

for prompt, result in zip(prompts, results):
    print(f"Q: {prompt}\nA: {result}\n")
|
||||
```
|
||||
|
||||
:::warning
|
||||
Always create a **new `AIAgent` instance per thread or task**. The agent maintains internal state (conversation history, tool sessions, iteration counters) that is not thread-safe to share.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Integration Examples
|
||||
|
||||
### FastAPI Endpoint
|
||||
|
||||
```python
|
||||
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent

app = FastAPI()


class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"


# Declared with plain `def`: FastAPI runs sync handlers in a threadpool,
# so the blocking agent.chat() call does not stall the event loop.
@app.post("/chat")
def chat(request: ChatRequest):
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    response = agent.chat(request.message)
    return {"response": response}
|
||||
```
|
||||
|
||||
### Discord Bot
|
||||
|
||||
```python
|
||||
import asyncio

import discord
from run_agent import AIAgent

# Reading message text requires the message content intent in discord.py 2.x
# (it must also be enabled for the bot in the Discord developer portal).
intents = discord.Intents.default()
intents.message_content = True
client = discord.Client(intents=intents)


@client.event
async def on_message(message):
    if message.author == client.user:
        return
    if message.content.startswith("!hermes "):
        query = message.content[8:]
        agent = AIAgent(
            model="anthropic/claude-sonnet-4",
            quiet_mode=True,
            skip_context_files=True,
            skip_memory=True,
            platform="discord",
        )
        # agent.chat() blocks, so run it in a worker thread to keep the bot responsive
        response = await asyncio.to_thread(agent.chat, query)
        # Discord rejects messages longer than 2000 characters, so truncate
        await message.channel.send(response[:2000])


client.run("YOUR_DISCORD_TOKEN")
|
||||
```
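
The bot above truncates replies at 2000 characters, which silently drops the rest of a long answer. An alternative, sketched here as a standalone helper (`split_message` is an illustrative name), is to split the reply into several messages and send each chunk in turn:

```python
def split_message(text: str, limit: int = 2000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:          # no usable newline in range: hard cut
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks


print([len(chunk) for chunk in split_message("a" * 4500)])  # [2000, 2000, 500]
```

In the handler, `for chunk in split_message(response): await message.channel.send(chunk)` would replace the single truncated send.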
|
||||
|
||||
### CI/CD Pipeline Step
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
|
||||
"""CI step: auto-review a PR diff."""
|
||||
import subprocess
|
||||
from run_agent import AIAgent
|
||||
|
||||
diff = subprocess.check_output(["git", "diff", "main...HEAD"]).decode()
|
||||
|
||||
agent = AIAgent(
|
||||
model="anthropic/claude-sonnet-4",
|
||||
quiet_mode=True,
|
||||
skip_context_files=True,
|
||||
skip_memory=True,
|
||||
disabled_toolsets=["terminal", "browser"],
|
||||
)
|
||||
|
||||
review = agent.chat(
|
||||
f"Review this PR diff for bugs, security issues, and style problems:\n\n{diff}"
|
||||
)
|
||||
print(review)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Constructor Parameters
|
||||
|
||||
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `str` | `"anthropic/claude-opus-4.6"` | Model in OpenRouter format |
| `quiet_mode` | `bool` | `False` | Suppress CLI output |
| `enabled_toolsets` | `List[str]` | `None` | Whitelist specific toolsets |
| `disabled_toolsets` | `List[str]` | `None` | Blacklist specific toolsets |
| `save_trajectories` | `bool` | `False` | Save conversations to JSONL |
| `ephemeral_system_prompt` | `str` | `None` | Custom system prompt (not saved to trajectories) |
| `max_iterations` | `int` | `90` | Max tool-calling iterations per conversation |
| `skip_context_files` | `bool` | `False` | Skip loading AGENTS.md files |
| `skip_memory` | `bool` | `False` | Disable persistent memory read/write |
| `api_key` | `str` | `None` | API key (falls back to env vars) |
| `base_url` | `str` | `None` | Custom API endpoint URL |
| `platform` | `str` | `None` | Platform hint (`"discord"`, `"telegram"`, etc.) |
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
:::tip
|
||||
- Set **`skip_context_files=True`** if you don't want `AGENTS.md` files from the working directory loaded into the system prompt.
|
||||
- Set **`skip_memory=True`** to prevent the agent from reading or writing persistent memory — recommended for stateless API endpoints.
|
||||
- The `platform` parameter (e.g., `"discord"`, `"telegram"`) injects platform-specific formatting hints so the agent adapts its output style.
|
||||
:::
|
||||
|
||||
:::warning
|
||||
- **Thread safety**: Create one `AIAgent` per thread or task. Never share an instance across concurrent calls.
|
||||
- **Resource cleanup**: The agent automatically cleans up resources (terminal sessions, browser instances) when a conversation ends. If you're running in a long-lived process, ensure each conversation completes normally.
|
||||
- **Iteration limits**: The default `max_iterations=90` is generous. For simple Q&A use cases, consider lowering it (e.g., `max_iterations=10`) to prevent runaway tool-calling loops and control costs.
|
||||
:::
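
The thread-safety note above can be sketched as a small helper that builds a fresh agent for every task instead of sharing one. To keep the sketch runnable without Hermes installed, it uses a stand-in `EchoAgent`; both `EchoAgent` and `run_isolated` are hypothetical names, and the only assumption about the real `AIAgent` is the blocking `chat()` method used in the examples earlier on this page.

```python
from concurrent.futures import ThreadPoolExecutor


class EchoAgent:
    """Stand-in for AIAgent so the sketch runs without Hermes installed."""

    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"


def run_isolated(make_agent, prompts, max_workers=4):
    """Answer each prompt with its own freshly constructed agent."""

    def handle(prompt):
        agent = make_agent()  # created per task, never shared across threads
        return agent.chat(prompt)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the prompts
        return list(pool.map(handle, prompts))


replies = run_isolated(EchoAgent, ["ping", "pong"])
```

With Hermes, `make_agent` would presumably be something like `lambda: AIAgent(quiet_mode=True, skip_memory=True, max_iterations=10)`, using the constructor parameters from the table above.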
|
||||
437
hermes_code/website/docs/guides/team-telegram-assistant.md
Normal file
|
|
@ -0,0 +1,437 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Tutorial: Team Telegram Assistant"
|
||||
description: "Step-by-step guide to setting up a Telegram bot that your whole team can use for code help, research, system admin, and more"
|
||||
---
|
||||
|
||||
# Set Up a Team Telegram Assistant
|
||||
|
||||
This tutorial walks you through setting up a Telegram bot powered by Hermes Agent that multiple team members can use. By the end, your team will have a shared AI assistant they can message for help with code, research, system administration, and anything else — secured with per-user authorization.
|
||||
|
||||
## What We're Building
|
||||
|
||||
A Telegram bot that:
|
||||
|
||||
- **Serves any authorized team member** by DM: code reviews, research, shell commands, debugging
- **Runs on your server** with full tool access: terminal, file editing, web search, code execution
- **Keeps per-user sessions**, so each person gets their own conversation context
- **Is secure by default**: only approved users can interact, with two authorization methods
- **Runs scheduled tasks**: daily standups, health checks, and reminders delivered to a team channel
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting, make sure you have:
|
||||
|
||||
- **Hermes Agent installed** on a server or VPS (not your laptop — the bot needs to stay running). Follow the [installation guide](/getting-started/learning-path) if you haven't yet.
|
||||
- **A Telegram account** for yourself (the bot owner)
|
||||
- **An LLM provider configured** — at minimum, an API key for OpenAI, Anthropic, or another supported provider in `~/.hermes/.env`
|
||||
|
||||
:::tip
|
||||
A $5/month VPS is plenty for running the gateway. Hermes itself is lightweight — the LLM API calls are what cost money, and those happen remotely.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Create a Telegram Bot
|
||||
|
||||
Every Telegram bot starts with **@BotFather** — Telegram's official bot for creating bots.
|
||||
|
||||
1. **Open Telegram** and search for `@BotFather`, or go to [t.me/BotFather](https://t.me/BotFather)
|
||||
|
||||
2. **Send `/newbot`** — BotFather will ask you two things:
|
||||
- **Display name** — what users see (e.g., `Team Hermes Assistant`)
|
||||
- **Username** — must end in `bot` (e.g., `myteam_hermes_bot`)
|
||||
|
||||
3. **Copy the bot token** — BotFather replies with something like:
|
||||
```
|
||||
Use this token to access the HTTP API:
|
||||
7123456789:AAH1bGciOiJSUzI1NiIsInR5cCI6Ikp...
|
||||
```
|
||||
Save this token — you'll need it in the next step.
|
||||
|
||||
4. **Set a description** (optional but recommended):
|
||||
```
|
||||
/setdescription
|
||||
```
|
||||
Choose your bot, then enter something like:
|
||||
```
|
||||
Team AI assistant powered by Hermes Agent. DM me for help with code, research, debugging, and more.
|
||||
```
|
||||
|
||||
5. **Set bot commands** (optional — gives users a command menu):
|
||||
```
|
||||
/setcommands
|
||||
```
|
||||
Choose your bot, then paste:
|
||||
```
|
||||
new - Start a fresh conversation
|
||||
model - Show or change the AI model
|
||||
status - Show session info
|
||||
help - Show available commands
|
||||
stop - Stop the current task
|
||||
```
|
||||
|
||||
:::warning
|
||||
Keep your bot token secret. Anyone with the token can control the bot. If it leaks, use `/revoke` in BotFather to generate a new one.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Configure the Gateway
|
||||
|
||||
You have two options: the interactive setup wizard (recommended) or manual configuration.
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
This walks you through everything with arrow-key selection. Pick **Telegram**, paste your bot token, and enter your user ID when prompted.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add these lines to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Telegram bot token from BotFather
|
||||
TELEGRAM_BOT_TOKEN=7123456789:AAH1bGciOiJSUzI1NiIsInR5cCI6Ikp...
|
||||
|
||||
# Your Telegram user ID (numeric)
|
||||
TELEGRAM_ALLOWED_USERS=123456789
|
||||
```
|
||||
|
||||
### Finding Your User ID
|
||||
|
||||
Your Telegram user ID is a numeric value (not your username). To find it:
|
||||
|
||||
1. Message [@userinfobot](https://t.me/userinfobot) on Telegram
|
||||
2. It instantly replies with your numeric user ID
|
||||
3. Copy that number into `TELEGRAM_ALLOWED_USERS`
|
||||
|
||||
:::info
|
||||
Telegram user IDs are permanent numbers like `123456789`. They're different from your `@username`, which can change. Always use the numeric ID for allowlists.
|
||||
:::
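
Because the allowlist only works with numeric IDs, it can help to sanity-check the value before restarting the gateway. The following is a minimal sketch of such a check; `parse_allowed_users` is a hypothetical helper, and how Hermes itself parses `TELEGRAM_ALLOWED_USERS` is not shown here.

```python
def parse_allowed_users(raw: str) -> list[int]:
    """Parse a comma-separated allowlist into numeric Telegram user IDs."""
    ids = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and stray whitespace
        if not (item.isascii() and item.isdigit()):
            # Usernames such as "@alice" are rejected: only numeric IDs are stable
            raise ValueError(f"not a numeric user ID: {item!r}")
        ids.append(int(item))
    return ids
```

For example, `parse_allowed_users("123456789, 987654321")` returns both IDs as integers, while a value containing `@alice` raises `ValueError`.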
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Start the Gateway
|
||||
|
||||
### Quick Test
|
||||
|
||||
Run the gateway in the foreground first to make sure everything works:
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
You should see output like:
|
||||
|
||||
```
|
||||
[Gateway] Starting Hermes Gateway...
|
||||
[Gateway] Telegram adapter connected
|
||||
[Gateway] Cron scheduler started (tick every 60s)
|
||||
```
|
||||
|
||||
Open Telegram, find your bot, and send it a message. If it replies, you're in business. Press `Ctrl+C` to stop.
|
||||
|
||||
### Production: Install as a Service
|
||||
|
||||
For a persistent deployment that survives reboots:
|
||||
|
||||
```bash
|
||||
hermes gateway install
|
||||
sudo hermes gateway install --system # Linux only: boot-time system service
|
||||
```
|
||||
|
||||
This creates a background service: a user-level **systemd** service on Linux by default, a **launchd** service on macOS, or a boot-time Linux system service if you pass `--system`.
|
||||
|
||||
```bash
|
||||
# Linux — manage the default user service
|
||||
hermes gateway start
|
||||
hermes gateway stop
|
||||
hermes gateway status
|
||||
|
||||
# View live logs
|
||||
journalctl --user -u hermes-gateway -f
|
||||
|
||||
# Keep running after SSH logout
|
||||
sudo loginctl enable-linger $USER
|
||||
|
||||
# Linux servers — explicit system-service commands
|
||||
sudo hermes gateway start --system
|
||||
sudo hermes gateway status --system
|
||||
journalctl -u hermes-gateway -f
|
||||
```
|
||||
|
||||
```bash
|
||||
# macOS — manage the service
|
||||
launchctl start ai.hermes.gateway
|
||||
launchctl stop ai.hermes.gateway
|
||||
tail -f ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
### Verify It's Running
|
||||
|
||||
```bash
|
||||
hermes gateway status
|
||||
```
|
||||
|
||||
Then send a test message to your bot on Telegram. You should get a response within a few seconds.
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Set Up Team Access
|
||||
|
||||
Now let's give your teammates access. There are two approaches.
|
||||
|
||||
### Approach A: Static Allowlist
|
||||
|
||||
Collect each team member's Telegram user ID (have them message [@userinfobot](https://t.me/userinfobot)) and add them as a comma-separated list:
|
||||
|
||||
```bash
|
||||
# In ~/.hermes/.env
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321,555555555
|
||||
```
|
||||
|
||||
Restart the gateway after changes:
|
||||
|
||||
```bash
|
||||
hermes gateway stop && hermes gateway start
|
||||
```
|
||||
|
||||
### Approach B: DM Pairing (Recommended for Teams)
|
||||
|
||||
DM pairing is more flexible — you don't need to collect user IDs upfront. Here's how it works:
|
||||
|
||||
1. **Teammate DMs the bot** — since they're not on the allowlist, the bot replies with a one-time pairing code:
|
||||
```
|
||||
🔐 Pairing code: XKGH5N7P
|
||||
Send this code to the bot owner for approval.
|
||||
```
|
||||
|
||||
2. **Teammate sends you the code** (via any channel — Slack, email, in person)
|
||||
|
||||
3. **You approve it** on the server:
|
||||
```bash
|
||||
hermes pairing approve telegram XKGH5N7P
|
||||
```
|
||||
|
||||
4. **They're in** — the bot immediately starts responding to their messages
|
||||
|
||||
**Managing paired users:**
|
||||
|
||||
```bash
|
||||
# See all pending and approved users
|
||||
hermes pairing list
|
||||
|
||||
# Revoke someone's access
|
||||
hermes pairing revoke telegram 987654321
|
||||
|
||||
# Clear expired pending codes
|
||||
hermes pairing clear-pending
|
||||
```
|
||||
|
||||
:::tip
|
||||
DM pairing is ideal for teams because you don't need to restart the gateway when adding new users. Approvals take effect immediately.
|
||||
:::
|
||||
|
||||
### Security Considerations
|
||||
|
||||
- **Never set `GATEWAY_ALLOW_ALL_USERS=true`** on a bot with terminal access — anyone who finds your bot could run commands on your server
|
||||
- Pairing codes expire after **1 hour** and use cryptographic randomness
|
||||
- Rate limiting prevents brute-force attacks: 1 request per user per 10 minutes, max 3 pending codes per platform
|
||||
- After 5 failed approval attempts, the platform enters a 1-hour lockout
|
||||
- All pairing data is stored with `chmod 0600` permissions
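
To make the first two properties concrete, here is a minimal sketch of how a one-time code with cryptographic randomness and a 1-hour expiry could be generated and checked. This is not Hermes's implementation: the function names are hypothetical, and the 8-character uppercase-plus-digits format is only inferred from the sample code `XKGH5N7P` shown earlier.

```python
import secrets
import string
import time

ALPHABET = string.ascii_uppercase + string.digits  # assumed code alphabet
CODE_TTL_SECONDS = 60 * 60  # codes expire after 1 hour


def new_pairing_code(length: int = 8) -> tuple[str, float]:
    """Return (code, expires_at) drawn from the OS's secure random source."""
    code = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return code, time.time() + CODE_TTL_SECONDS


def is_valid(submitted: str, code: str, expires_at: float) -> bool:
    """Check expiry, then compare in constant time to avoid timing leaks."""
    return time.time() < expires_at and secrets.compare_digest(submitted, code)
```

Using `secrets` rather than `random` matters here: the `random` module is predictable and is not meant for security tokens.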
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Configure the Bot
|
||||
|
||||
### Set a Home Channel
|
||||
|
||||
A **home channel** is where the bot delivers cron job results and proactive messages. Without one, scheduled tasks have nowhere to send output.
|
||||
|
||||
**Option 1:** Use the `/sethome` command in any Telegram group or chat where the bot is a member.
|
||||
|
||||
**Option 2:** Set it manually in `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
TELEGRAM_HOME_CHANNEL=-1001234567890
|
||||
TELEGRAM_HOME_CHANNEL_NAME="Team Updates"
|
||||
```
|
||||
|
||||
To find a channel ID, add [@userinfobot](https://t.me/userinfobot) to the group — it will report the group's chat ID.
|
||||
|
||||
### Configure Tool Progress Display
|
||||
|
||||
Control how much detail the bot shows when using tools. In `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
display:
  tool_progress: new  # off | new | all | verbose
|
||||
```
|
||||
|
||||
| Mode | What You See |
|------|-------------|
| `off` | Clean responses only, no tool activity |
| `new` | Brief status for each new tool call (recommended for messaging) |
| `all` | Every tool call with details |
| `verbose` | Full tool output including command results |
|
||||
|
||||
Users can also change this per-session with the `/verbose` command in chat.
|
||||
|
||||
### Set Up a Personality with SOUL.md
|
||||
|
||||
Customize how the bot communicates by editing `~/.hermes/SOUL.md`. For a full guide, see [Use SOUL.md with Hermes](/docs/guides/use-soul-with-hermes). A short example:
|
||||
|
||||
```markdown
|
||||
# Soul
|
||||
You are a helpful team assistant. Be concise and technical.
|
||||
Use code blocks for any code. Skip pleasantries — the team
|
||||
values directness. When debugging, always ask for error logs
|
||||
before guessing at solutions.
|
||||
```
|
||||
|
||||
### Add Project Context
|
||||
|
||||
If your team works on specific projects, create context files so the bot knows your stack:
|
||||
|
||||
```markdown
|
||||
<!-- ~/.hermes/AGENTS.md -->
|
||||
# Team Context
|
||||
- We use Python 3.12 with FastAPI and SQLAlchemy
|
||||
- Frontend is React with TypeScript
|
||||
- CI/CD runs on GitHub Actions
|
||||
- Production deploys to AWS ECS
|
||||
- Always suggest writing tests for new code
|
||||
```
|
||||
|
||||
:::info
|
||||
Context files are injected into every session's system prompt. Keep them concise — every character counts against your token budget.
|
||||
:::
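
To see how much of the budget your context files use, a rough measurement is often enough. The sketch below counts characters and converts them to an approximate token count; `context_budget` is a hypothetical helper, and the 4-characters-per-token ratio is only a common rule of thumb for English text, not a figure from Hermes or any specific tokenizer.

```python
from pathlib import Path


def context_budget(paths, chars_per_token: float = 4.0):
    """Return (total_chars, rough_token_estimate) for the files that exist."""
    total = 0
    for p in paths:
        p = Path(p).expanduser()
        if p.is_file():  # missing files are skipped silently
            total += len(p.read_text(encoding="utf-8"))
    return total, round(total / chars_per_token)
```

A call such as `context_budget(["~/.hermes/AGENTS.md", "~/.hermes/SOUL.md"])` gives a quick sense of what every session pays up front.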
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Set Up Scheduled Tasks
|
||||
|
||||
With the gateway running, you can schedule recurring tasks that deliver results to your team channel.
|
||||
|
||||
### Daily Standup Summary
|
||||
|
||||
Message the bot on Telegram:
|
||||
|
||||
```
|
||||
Every weekday at 9am, check the GitHub repository at
|
||||
github.com/myorg/myproject for:
|
||||
1. Pull requests opened/merged in the last 24 hours
|
||||
2. Issues created or closed
|
||||
3. Any CI/CD failures on the main branch
|
||||
Format as a brief standup-style summary.
|
||||
```
|
||||
|
||||
The agent creates a cron job automatically and delivers results to the chat where you asked (or the home channel).
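
In standard five-field cron syntax, "every weekday at 9am" is usually written `0 9 * * 1-5`. Whether the agent stores the job in exactly that form is an assumption. To check when such a schedule would next fire, a small sketch like the following can help; `next_weekday_run` is a hypothetical helper and works in the server's local time.

```python
from datetime import datetime, timedelta


def next_weekday_run(now: datetime, hour: int = 9) -> datetime:
    """Return the next Monday-to-Friday occurrence of `hour`:00 after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate
```

Keep in mind that the schedule follows the server's clock, so a team spread across time zones may want to state the zone explicitly in the prompt.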
|
||||
|
||||
### Server Health Check
|
||||
|
||||
```
|
||||
Every 6 hours, check disk usage with 'df -h', memory with 'free -h',
|
||||
and Docker container status with 'docker ps'. Report anything unusual —
|
||||
partitions above 80%, containers that have restarted, or high memory usage.
|
||||
```
|
||||
|
||||
### Managing Scheduled Tasks
|
||||
|
||||
```bash
|
||||
# From the CLI
|
||||
hermes cron list # View all scheduled jobs
|
||||
hermes cron status # Check if scheduler is running
|
||||
|
||||
# From Telegram chat
|
||||
/cron list # View jobs
|
||||
/cron remove <job_id> # Remove a job
|
||||
```
|
||||
|
||||
:::warning
|
||||
Cron job prompts run in completely fresh sessions with no memory of prior conversations. Make sure each prompt contains **all** the context the agent needs — file paths, URLs, server addresses, and clear instructions.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Production Tips
|
||||
|
||||
### Use Docker for Safety
|
||||
|
||||
On a shared team bot, use Docker as the terminal backend so agent commands run in a container instead of on your host:
|
||||
|
||||
```bash
|
||||
# In ~/.hermes/.env
|
||||
TERMINAL_BACKEND=docker
|
||||
TERMINAL_DOCKER_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20
|
||||
```
|
||||
|
||||
Or in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
terminal:
  backend: docker
  container_cpu: 1
  container_memory: 5120
  container_persistent: true
|
||||
```
|
||||
|
||||
This way, even if someone asks the bot to run something destructive, your host system is protected.
|
||||
|
||||
### Monitor the Gateway
|
||||
|
||||
```bash
|
||||
# Check if the gateway is running
|
||||
hermes gateway status
|
||||
|
||||
# Watch live logs (Linux)
|
||||
journalctl --user -u hermes-gateway -f
|
||||
|
||||
# Watch live logs (macOS)
|
||||
tail -f ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
### Keep Hermes Updated
|
||||
|
||||
From Telegram, send `/update` to the bot — it will pull the latest version and restart. Or from the server:
|
||||
|
||||
```bash
|
||||
hermes update
|
||||
hermes gateway stop && hermes gateway start
|
||||
```
|
||||
|
||||
### Log Locations
|
||||
|
||||
| What | Location |
|------|----------|
| Gateway logs | `journalctl --user -u hermes-gateway` (Linux) or `~/.hermes/logs/gateway.log` (macOS) |
| Cron job output | `~/.hermes/cron/output/{job_id}/{timestamp}.md` |
| Cron job definitions | `~/.hermes/cron/jobs.json` |
| Pairing data | `~/.hermes/pairing/` |
| Session history | `~/.hermes/sessions/` |
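
Given the cron output layout in the table above, a short script can pull up the most recent report for each job. This is a minimal sketch: `latest_cron_outputs` is a hypothetical helper, and it assumes the `{timestamp}` file names sort chronologically when sorted as text, which may not hold for every timestamp format.

```python
from pathlib import Path


def latest_cron_outputs(root="~/.hermes/cron/output"):
    """Map each job ID to its newest output file; return {} if none exist."""
    root = Path(root).expanduser()
    latest = {}
    if not root.is_dir():
        return latest
    for job_dir in sorted(root.iterdir()):
        if not job_dir.is_dir():
            continue
        reports = sorted(job_dir.glob("*.md"))  # assumes sortable timestamps
        if reports:
            latest[job_dir.name] = reports[-1]
    return latest
```

Printing `latest_cron_outputs()` on the server gives a quick overview of which jobs have produced output and where to read it.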
|
||||
|
||||
---
|
||||
|
||||
## Going Further
|
||||
|
||||
You've got a working team Telegram assistant. Here are some next steps:
|
||||
|
||||
- **[Security Guide](/user-guide/security)** — deep dive into authorization, container isolation, and command approval
|
||||
- **[Messaging Gateway](/user-guide/messaging)** — full reference for gateway architecture, session management, and chat commands
|
||||
- **[Telegram Setup](/user-guide/messaging/telegram)** — platform-specific details including voice messages and TTS
|
||||
- **[Scheduled Tasks](/user-guide/features/cron)** — advanced cron scheduling with delivery options and cron expressions
|
||||
- **[Context Files](/user-guide/features/context-files)** — AGENTS.md, SOUL.md, and .cursorrules for project knowledge
|
||||
- **[Personality](/user-guide/features/personality)** — built-in personality presets and custom persona definitions
|
||||
- **Add more platforms** — the same gateway can simultaneously run [Discord](/user-guide/messaging/discord), [Slack](/user-guide/messaging/slack), and [WhatsApp](/user-guide/messaging/whatsapp)
|
||||
|
||||
---
|
||||
|
||||
*Questions or issues? Open an issue on GitHub — contributions are welcome.*
|
||||
234
hermes_code/website/docs/guides/tips.md
Normal file
|
|
@ -0,0 +1,234 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Tips & Best Practices"
|
||||
description: "Practical advice to get the most out of Hermes Agent — prompt tips, CLI shortcuts, context files, memory, cost optimization, and security"
|
||||
---
|
||||
|
||||
# Tips & Best Practices
|
||||
|
||||
A quick-wins collection of practical tips that make you immediately more effective with Hermes Agent. Each section targets a different aspect — scan the headers and jump to what's relevant.
|
||||
|
||||
---
|
||||
|
||||
## Getting the Best Results
|
||||
|
||||
### Be Specific About What You Want
|
||||
|
||||
Vague prompts produce vague results. Instead of "fix the code," say "fix the TypeError in `api/handlers.py` on line 47 — the `process_request()` function receives `None` from `parse_body()`." The more context you give, the fewer iterations you need.
|
||||
|
||||
### Provide Context Up Front
|
||||
|
||||
Front-load your request with the relevant details: file paths, error messages, expected behavior. One well-crafted message beats three rounds of clarification. Paste error tracebacks directly — the agent can parse them.
|
||||
|
||||
### Use Context Files for Recurring Instructions
|
||||
|
||||
If you find yourself repeating the same instructions ("use tabs not spaces," "we use pytest," "the API is at `/api/v2`"), put them in an `AGENTS.md` file. The agent reads it automatically every session — zero effort after setup.
|
||||
|
||||
### Let the Agent Use Its Tools
|
||||
|
||||
Don't try to hand-hold every step. Say "find and fix the failing test" rather than "open `tests/test_foo.py`, look at line 42, then..." The agent has file search, terminal access, and code execution — let it explore and iterate.
|
||||
|
||||
### Use Skills for Complex Workflows
|
||||
|
||||
Before writing a long prompt explaining how to do something, check if there's already a skill for it. Type `/skills` to browse available skills, or just invoke one directly like `/axolotl` or `/github-pr-workflow`.
|
||||
|
||||
## CLI Power User Tips
|
||||
|
||||
### Multi-Line Input
|
||||
|
||||
Press **Alt+Enter** (or **Ctrl+J**) to insert a newline without sending. This lets you compose multi-line prompts, paste code blocks, or structure complex requests before hitting Enter to send.
|
||||
|
||||
### Paste Detection
|
||||
|
||||
The CLI auto-detects multi-line pastes. Just paste a code block or error traceback directly — it won't send each line as a separate message. The paste is buffered and sent as one message.
|
||||
|
||||
### Interrupt and Redirect
|
||||
|
||||
Press **Ctrl+C** once to interrupt the agent mid-response. You can then type a new message to redirect it. Double-press Ctrl+C within 2 seconds to force exit. This is invaluable when the agent starts going down the wrong path.
|
||||
|
||||
### Resume Sessions with `-c`
|
||||
|
||||
Forgot something from your last session? Run `hermes -c` to resume exactly where you left off, with full conversation history restored. You can also resume by title: `hermes -r "my research project"`.
|
||||
|
||||
### Clipboard Image Paste
|
||||
|
||||
Press **Ctrl+V** to paste an image from your clipboard directly into the chat. The agent uses vision to analyze screenshots, diagrams, error popups, or UI mockups — no need to save to a file first.
|
||||
|
||||
### Slash Command Autocomplete
|
||||
|
||||
Type `/` and press **Tab** to see all available commands. This includes built-in commands (`/compress`, `/model`, `/title`) and every installed skill. You don't need to memorize anything — Tab completion has you covered.
|
||||
|
||||
:::tip
|
||||
Use `/verbose` to cycle through tool output display modes: **off → new → all → verbose**. The "all" mode is great for watching what the agent does; "off" is cleanest for simple Q&A.
|
||||
:::
|
||||
|
||||
## Context Files
|
||||
|
||||
### AGENTS.md: Your Project's Brain
|
||||
|
||||
Create an `AGENTS.md` in your project root with architecture decisions, coding conventions, and project-specific instructions. This is automatically injected into every session, so the agent always knows your project's rules.
|
||||
|
||||
```markdown
|
||||
# Project Context
|
||||
- This is a FastAPI backend with SQLAlchemy ORM
|
||||
- Always use async/await for database operations
|
||||
- Tests go in tests/ and use pytest-asyncio
|
||||
- Never commit .env files
|
||||
```
|
||||
|
||||
### SOUL.md: Customize Personality
|
||||
|
||||
Want Hermes to have a stable default voice? Edit `~/.hermes/SOUL.md` (or `$HERMES_HOME/SOUL.md` if you use a custom Hermes home). Hermes now seeds a starter SOUL automatically and uses that global file as the instance-wide personality source.
|
||||
|
||||
For a full walkthrough, see [Use SOUL.md with Hermes](/docs/guides/use-soul-with-hermes).
|
||||
|
||||
```markdown
|
||||
# Soul
|
||||
You are a senior backend engineer. Be terse and direct.
|
||||
Skip explanations unless asked. Prefer one-liners over verbose solutions.
|
||||
Always consider error handling and edge cases.
|
||||
```
|
||||
|
||||
Use `SOUL.md` for durable personality. Use `AGENTS.md` for project-specific instructions.
|
||||
|
||||
### .cursorrules Compatibility
|
||||
|
||||
Already have a `.cursorrules` or `.cursor/rules/*.mdc` file? Hermes reads those too. No need to duplicate your coding conventions — they're loaded automatically from the working directory.
|
||||
|
||||
### Hierarchical Discovery
|
||||
|
||||
Hermes walks the directory tree and discovers **all** `AGENTS.md` files at every level. In a monorepo, put project-wide conventions at the root and team-specific ones in subdirectories — they're all concatenated together with path headers.
|
||||
|
||||
:::tip
|
||||
Keep context files focused and concise. They are part of the system prompt, so every character counts against your token budget on every request.
|
||||
:::
|
||||
|
||||
## Memory & Skills
|
||||
|
||||
### Memory vs. Skills: What Goes Where
|
||||
|
||||
**Memory** is for facts: your environment, preferences, project locations, and things the agent has learned about you. **Skills** are for procedures: multi-step workflows, tool-specific instructions, and reusable recipes. Use memory for "what," skills for "how."
|
||||
|
||||
### When to Create Skills
|
||||
|
||||
If you find a task that takes 5+ steps and you'll do it again, ask the agent to create a skill for it. Say "save what you just did as a skill called `deploy-staging`." Next time, just type `/deploy-staging` and the agent loads the full procedure.
|
||||
|
||||
### Managing Memory Capacity
|
||||
|
||||
Memory is intentionally bounded (~2,200 chars for MEMORY.md, ~1,375 chars for USER.md). When it fills up, the agent consolidates entries. You can help by saying "clean up your memory" or "replace the old Python 3.9 note — we're on 3.12 now."
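
If you want to see how close the memory files are to those limits, a quick character count is enough. The sketch below is illustrative only: `memory_usage` is a hypothetical helper, the limits are the approximate figures quoted above, and the directory that holds `MEMORY.md` and `USER.md` is passed in because its location is not stated here.

```python
from pathlib import Path

# Approximate limits quoted above; treat them as guidance, not exact values.
LIMITS = {"MEMORY.md": 2200, "USER.md": 1375}


def memory_usage(memory_dir):
    """Return {filename: (chars_used, limit)} for the memory files that exist."""
    usage = {}
    for name, limit in LIMITS.items():
        path = Path(memory_dir).expanduser() / name
        if path.is_file():
            usage[name] = (len(path.read_text(encoding="utf-8")), limit)
    return usage
```

A file that sits near its limit is a good cue to ask the agent to consolidate or replace stale entries.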
|
||||
|
||||
### Let the Agent Remember
|
||||
|
||||
After a productive session, say "remember this for next time" and the agent will save the key takeaways. You can also be specific: "save to memory that our CI uses GitHub Actions with the `deploy.yml` workflow."
|
||||
|
||||
:::warning
|
||||
Memory is a frozen snapshot — changes made during a session don't appear in the system prompt until the next session starts. The agent writes to disk immediately, but the prompt cache isn't invalidated mid-session.
|
||||
:::
|
||||
|
||||
## Performance & Cost
|
||||
|
||||
### Don't Break the Prompt Cache
|
||||
|
||||
Most LLM providers cache the system prompt prefix. If you keep your system prompt stable (same context files, same memory), subsequent messages in a session get **cache hits** that are significantly cheaper. Avoid changing the model or system prompt mid-session.
|
||||
|
||||
### Use /compress Before Hitting Limits
|
||||
|
||||
Long sessions accumulate tokens. When you notice responses slowing down or getting truncated, run `/compress`. This summarizes the conversation history, preserving key context while dramatically reducing token count. Use `/usage` to check where you stand.
|
||||
|
||||
### Delegate for Parallel Work
|
||||
|
||||
Need to research three topics at once? Ask the agent to use `delegate_task` with parallel subtasks. Each subagent runs independently with its own context, and only the final summaries come back — massively reducing your main conversation's token usage.
|
||||
|
||||
### Use execute_code for Batch Operations
|
||||
|
||||
Instead of running terminal commands one at a time, ask the agent to write a script that does everything at once. "Write a Python script to rename all `.jpeg` files to `.jpg` and run it" is cheaper and faster than renaming files individually.
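
For reference, the kind of script the agent might write for that request looks roughly like this. It is a minimal sketch, not output captured from Hermes; `rename_jpeg_to_jpg` is a hypothetical name, and the choice to skip files whose target already exists is a design assumption made here for safety.

```python
from pathlib import Path


def rename_jpeg_to_jpg(directory="."):
    """Rename every *.jpeg file in `directory` to *.jpg; return the new paths."""
    renamed = []
    for src in sorted(Path(directory).glob("*.jpeg")):
        dst = src.with_suffix(".jpg")
        if dst.exists():
            continue  # never overwrite an existing file
        src.rename(dst)
        renamed.append(dst)
    return renamed
```

One script run replaces what would otherwise be a separate tool call per file, which is where the cost and latency savings come from.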
|
||||
|
||||
### Choose the Right Model
|
||||
|
||||
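Use `/model` to switch models. Pick a frontier model (Claude Sonnet/Opus, GPT-4o) for complex reasoning and architecture decisions, and a faster model for simple tasks like formatting, renaming, or boilerplate generation. Switching mid-session gives up the prompt-cache savings described above, so change models at task boundaries rather than back and forth.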
|
||||
|
||||
:::tip
|
||||
Run `/usage` periodically to see your token consumption. Run `/insights` for a broader view of usage patterns over the last 30 days.
|
||||
:::
|
||||
|
||||
## Messaging Tips
|
||||
|
||||
### Set a Home Channel
|
||||
|
||||
Use `/sethome` in your preferred Telegram or Discord chat to designate it as the home channel. Cron job results and scheduled task outputs are delivered here. Without it, the agent has nowhere to send proactive messages.
|
||||
|
||||
### Use /title to Organize Sessions
|
||||
|
||||
Name your sessions with `/title auth-refactor` or `/title research-llm-quantization`. Named sessions are easy to find with `hermes sessions list` and resume with `hermes -r "auth-refactor"`. Unnamed sessions pile up and become impossible to distinguish.
|
||||
|
||||
### DM Pairing for Team Access
|
||||
|
||||
Instead of manually collecting user IDs for allowlists, enable DM pairing. When a teammate DMs the bot, they get a one-time pairing code. You approve it with `hermes pairing approve telegram XKGH5N7P` — simple and secure.
|
||||
|
||||
### Tool Progress Display Modes
|
||||
|
||||
Use `/verbose` to control how much tool activity you see. In messaging platforms, less is usually more — keep it on "new" to see just new tool calls. In the CLI, "all" gives you a satisfying live view of everything the agent does.
|
||||
|
||||
:::tip
|
||||
On messaging platforms, sessions auto-reset after idle time (default: 24 hours) or daily at 4 AM. Adjust per-platform in `~/.hermes/config.yaml` if you need longer sessions.
|
||||
:::
|
||||
|
||||
## Security
|
||||
|
||||
### Use Docker for Untrusted Code
|
||||
|
||||
When working with untrusted repositories or running unfamiliar code, use Docker or Daytona as your terminal backend. Set `TERMINAL_BACKEND=docker` in your `.env`. Destructive commands inside a container can't harm your host system.
|
||||
|
||||
```bash
|
||||
# In your .env:
|
||||
TERMINAL_BACKEND=docker
|
||||
TERMINAL_DOCKER_IMAGE=hermes-sandbox:latest
|
||||
```
|
||||
|
||||
### Avoid Windows Encoding Pitfalls
|
||||
|
||||
On Windows, some default encodings (such as `cp125x`) cannot represent all Unicode characters, which can cause `UnicodeEncodeError` when writing files in tests or scripts.
|
||||
|
||||
- Prefer opening files with an explicit UTF-8 encoding:
|
||||
|
||||
```python
|
||||
with open("results.txt", "w", encoding="utf-8") as f:
    f.write("✓ All good\n")
|
||||
```
|
||||
|
||||
- In PowerShell, you can also switch the current session to UTF-8 for console and native command output:
|
||||
|
||||
```powershell
|
||||
$OutputEncoding = [Console]::OutputEncoding = [Text.UTF8Encoding]::new($false)
|
||||
```
|
||||
|
||||
This keeps PowerShell and child processes on UTF-8 and helps avoid Windows-only failures.
|
||||
|
||||
### Review Before Choosing "Always"
|
||||
|
||||
When the agent triggers a dangerous command approval (`rm -rf`, `DROP TABLE`, etc.), you get four options: **once**, **session**, **always**, **deny**. Think carefully before choosing "always" — it permanently allowlists that pattern. Start with "session" until you're comfortable.
|
||||
|
||||
### Command Approval Is Your Safety Net
|
||||
|
||||
Hermes checks every command against a curated list of dangerous patterns before execution. This includes recursive deletes, SQL drops, piping curl to shell, and more. Don't disable this in production — it exists for good reasons.
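
To illustrate the idea of pattern-based screening, here is a minimal sketch that flags the three categories named above. The regular expressions are illustrative only and far from complete; they are not Hermes's actual list, and `flag_dangerous` is a hypothetical helper.

```python
import re

# Illustrative patterns only; Hermes's real list is not reproduced here.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+(-[a-zA-Z]*[rR][a-zA-Z]*)\b"), "recursive delete"),
    (re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE), "SQL drop"),
    (re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b"), "pipe to shell"),
]


def flag_dangerous(command: str):
    """Return the labels of every pattern the command matches."""
    return [label for pattern, label in DANGEROUS_PATTERNS if pattern.search(command)]
```

Pattern matching like this is easy to evade with quoting or indirection, which is one reason to treat approval prompts as a safety net rather than a sandbox.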
|
||||
|
||||
:::warning
|
||||
When running in a container backend (Docker, Singularity, Modal, Daytona), dangerous command checks are **skipped** because the container is the security boundary. Make sure your container images are properly locked down.
|
||||
:::
|
||||
|
||||
### Use Allowlists for Messaging Bots
|
||||
|
||||
Never set `GATEWAY_ALLOW_ALL_USERS=true` on a bot with terminal access. Always use platform-specific allowlists (`TELEGRAM_ALLOWED_USERS`, `DISCORD_ALLOWED_USERS`) or DM pairing to control who can interact with your agent.
|
||||
|
||||
```bash
|
||||
# Recommended: explicit allowlists per platform
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321
|
||||
DISCORD_ALLOWED_USERS=123456789012345678
|
||||
|
||||
# Or use cross-platform allowlist
|
||||
GATEWAY_ALLOWED_USERS=123456789,987654321
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
*Have a tip that should be on this page? Open an issue or PR — community contributions are welcome.*
|
||||
415
hermes_code/website/docs/guides/use-mcp-with-hermes.md
Normal file
|
|
@ -0,0 +1,415 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Use MCP with Hermes"
|
||||
description: "A practical guide to connecting MCP servers to Hermes Agent, filtering their tools, and using them safely in real workflows"
|
||||
---
|
||||
|
||||
# Use MCP with Hermes
|
||||
|
||||
This guide shows how to actually use MCP with Hermes Agent in day-to-day workflows.
|
||||
|
||||
If the feature page explains what MCP is, this guide is about how to get value from it quickly and safely.
|
||||
|
||||
## When should you use MCP?
|
||||
|
||||
Use MCP when:
|
||||
- a tool already exists in MCP form and you do not want to build a native Hermes tool
|
||||
- you want Hermes to operate against a local or remote system through a clean RPC layer
|
||||
- you want fine-grained per-server exposure control
|
||||
- you want to connect Hermes to internal APIs, databases, or company systems without modifying Hermes core
|
||||
|
||||
Do not use MCP when:
|
||||
- a built-in Hermes tool already solves the job well
|
||||
- the server exposes a huge dangerous tool surface and you are not prepared to filter it
|
||||
- you only need one very narrow integration and a native tool would be simpler and safer
|
||||
|
||||
## Mental model
|
||||
|
||||
Think of MCP as an adapter layer:
|
||||
|
||||
- Hermes remains the agent
|
||||
- MCP servers contribute tools
|
||||
- Hermes discovers those tools at startup or reload time
|
||||
- the model can use them like normal tools
|
||||
- you control how much of each server is visible
|
||||
|
||||
That last part matters. Good MCP usage is not just “connect everything.” It is “connect the right thing, with the smallest useful surface.”
|
||||
|
||||
## Step 1: install MCP support
|
||||
|
||||
If you installed Hermes with the standard install script, MCP support is already included (the installer runs `uv pip install -e ".[all]"`).
|
||||
|
||||
If you installed without extras and need to add MCP separately:
|
||||
|
||||
```bash
|
||||
cd ~/.hermes/hermes-agent
|
||||
uv pip install -e ".[mcp]"
|
||||
```
|
||||
|
||||
For npm-based servers, make sure Node.js and `npx` are available.
|
||||
|
||||
For many Python MCP servers, `uvx` is a nice default.
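If you want to confirm the launchers are present before editing any config, a quick preflight check along these lines can help. The `missing_launchers` helper is hypothetical; it only relies on the standard library's `shutil.which`.

```python
import shutil


def missing_launchers(launchers=("npx", "uvx")) -> list[str]:
    """Return the launcher commands that cannot be found on PATH."""
    return [name for name in launchers if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_launchers()
    print("missing launchers:", ", ".join(missing) if missing else "none")
```

Run it in the same shell environment you start Hermes from, since PATH can differ between shells and service managers.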
|
||||
|
||||
## Step 2: add one server first
|
||||
|
||||
Start with a single, safe server.
|
||||
|
||||
Example: filesystem access to one project directory only.
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
project_fs:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/my-project"]
|
||||
```
|
||||
|
||||
Then start Hermes:
|
||||
|
||||
```bash
|
||||
hermes chat
|
||||
```
|
||||
|
||||
Now ask something concrete:
|
||||
|
||||
```text
|
||||
Inspect this project and summarize the repo layout.
|
||||
```
|
||||
|
||||
## Step 3: verify MCP loaded
|
||||
|
||||
You can verify MCP in a few ways:
|
||||
|
||||
- Hermes banner/status should show MCP integration when configured
|
||||
- ask Hermes what tools it has available
|
||||
- use `/reload-mcp` after config changes
|
||||
- check logs if the server failed to connect
|
||||
|
||||
A practical test prompt:
|
||||
|
||||
```text
|
||||
Tell me which MCP-backed tools are available right now.
|
||||
```
|
||||
|
||||
## Step 4: start filtering immediately
|
||||
|
||||
Do not wait until later if the server exposes a lot of tools.
|
||||
|
||||
### Example: whitelist only what you want
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
|
||||
tools:
|
||||
include: [list_issues, create_issue, search_code]
|
||||
```
|
||||
|
||||
This is usually the best default for sensitive systems.
|
||||
|
||||
### Example: blacklist dangerous actions
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
stripe:
|
||||
url: "https://mcp.stripe.com"
|
||||
headers:
|
||||
Authorization: "Bearer ***"
|
||||
tools:
|
||||
exclude: [delete_customer, refund_payment]
|
||||
```
|
||||
|
||||
### Example: disable utility wrappers too
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
docs:
|
||||
url: "https://mcp.docs.example.com"
|
||||
tools:
|
||||
prompts: false
|
||||
resources: false
|
||||
```
|
||||
|
||||
## What does filtering actually affect?
|
||||
|
||||
There are two categories of MCP-exposed functionality in Hermes:
|
||||
|
||||
1. Server-native MCP tools
|
||||
- filtered with:
|
||||
- `tools.include`
|
||||
- `tools.exclude`
|
||||
|
||||
2. Hermes-added utility wrappers
|
||||
- filtered with:
|
||||
- `tools.resources`
|
||||
- `tools.prompts`
|
||||
|
||||
### Utility wrappers you may see
|
||||
|
||||
Resources:
|
||||
- `list_resources`
|
||||
- `read_resource`
|
||||
|
||||
Prompts:
|
||||
- `list_prompts`
|
||||
- `get_prompt`
|
||||
|
||||
These wrappers only appear if:
|
||||
- your config allows them, and
|
||||
- the MCP server session actually supports those capabilities
|
||||
|
||||
So Hermes will not pretend a server has resources/prompts if it does not.
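The two filter categories can be sketched as a single function. This is a simplified model, not Hermes's registration code: `visible_tools` is a hypothetical name, and two behaviors are assumptions (that `include` takes precedence when both lists are set, and that `resources`/`prompts` default to enabled).

```python
UTILITY_WRAPPERS = {
    "resources": ["list_resources", "read_resource"],
    "prompts": ["list_prompts", "get_prompt"],
}


def visible_tools(server_tools, tools_cfg=None, capabilities=()):
    """Return the tool names a model would see for one server (simplified model)."""
    cfg = tools_cfg or {}
    include = cfg.get("include")
    exclude = set(cfg.get("exclude") or [])

    # Server-native tools: allowlist if present, otherwise apply the denylist.
    if include is not None:
        allowed = set(include)  # assumption: include wins over exclude
        native = [tool for tool in server_tools if tool in allowed]
    else:
        native = [tool for tool in server_tools if tool not in exclude]

    # Utility wrappers: need BOTH the config toggle and a server capability.
    wrappers = []
    for kind, names in UTILITY_WRAPPERS.items():
        if cfg.get(kind, True) and kind in capabilities:  # assumption: default on
            wrappers.extend(names)
    return native + wrappers
```

The point of the model is the last loop: a wrapper appears only when the config permits it and the server advertises the capability.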
|
||||
|
||||
## Common patterns
|
||||
|
||||
### Pattern 1: local project assistant
|
||||
|
||||
Use MCP for a repo-local filesystem or git server when you want Hermes to reason over a bounded workspace.
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
fs:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/project"]
|
||||
|
||||
git:
|
||||
command: "uvx"
|
||||
args: ["mcp-server-git", "--repository", "/home/user/project"]
|
||||
```
|
||||
|
||||
Good prompts:
|
||||
|
||||
```text
|
||||
Review the project structure and identify where configuration lives.
|
||||
```
|
||||
|
||||
```text
|
||||
Check the local git state and summarize what changed recently.
|
||||
```
|
||||
|
||||
### Pattern 2: GitHub triage assistant
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
|
||||
tools:
|
||||
include: [list_issues, create_issue, update_issue, search_code]
|
||||
prompts: false
|
||||
resources: false
|
||||
```
|
||||
|
||||
Good prompts:
|
||||
|
||||
```text
|
||||
List open issues about MCP, cluster them by theme, and draft a high-quality issue for the most common bug.
|
||||
```
|
||||
|
||||
```text
|
||||
Search the repo for uses of _discover_and_register_server and explain how MCP tools are registered.
|
||||
```
|
||||
|
||||
### Pattern 3: internal API assistant
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
internal_api:
|
||||
url: "https://mcp.internal.example.com"
|
||||
headers:
|
||||
Authorization: "Bearer ***"
|
||||
tools:
|
||||
include: [list_customers, get_customer, list_invoices]
|
||||
resources: false
|
||||
prompts: false
|
||||
```
|
||||
|
||||
Good prompts:
|
||||
|
||||
```text
|
||||
Look up customer ACME Corp and summarize recent invoice activity.
|
||||
```
|
||||
|
||||
This is the sort of place where a strict whitelist is far better than an exclude list.
|
||||
|
||||
### Pattern 4: documentation / knowledge servers
|
||||
|
||||
Some MCP servers expose prompts or resources that are more like shared knowledge assets than direct actions.
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
docs:
|
||||
url: "https://mcp.docs.example.com"
|
||||
tools:
|
||||
prompts: true
|
||||
resources: true
|
||||
```
|
||||
|
||||
Good prompts:
|
||||
|
||||
```text
|
||||
List available MCP resources from the docs server, then read the onboarding guide and summarize it.
|
||||
```
|
||||
|
||||
```text
|
||||
List prompts exposed by the docs server and tell me which ones would help with incident response.
|
||||
```
|
||||
|
||||
## Tutorial: end-to-end setup with filtering
|
||||
|
||||
Here is a practical progression.
|
||||
|
||||
### Phase 1: add GitHub MCP with a tight whitelist
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
|
||||
tools:
|
||||
include: [list_issues, create_issue, search_code]
|
||||
prompts: false
|
||||
resources: false
|
||||
```
|
||||
|
||||
Start Hermes and ask:
|
||||
|
||||
```text
|
||||
Search the codebase for references to MCP and summarize the main integration points.
|
||||
```
|
||||
|
||||
### Phase 2: expand only when needed
|
||||
|
||||
If you later need issue updates too:
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
include: [list_issues, create_issue, update_issue, search_code]
|
||||
```
|
||||
|
||||
Then reload:
|
||||
|
||||
```text
|
||||
/reload-mcp
|
||||
```
|
||||
|
||||
### Phase 3: add a second server with different policy
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
|
||||
tools:
|
||||
include: [list_issues, create_issue, update_issue, search_code]
|
||||
prompts: false
|
||||
resources: false
|
||||
|
||||
filesystem:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/project"]
|
||||
```
|
||||
|
||||
Now Hermes can combine them:
|
||||
|
||||
```text
|
||||
Inspect the local project files, then create a GitHub issue summarizing the bug you find.
|
||||
```
|
||||
|
||||
That is where MCP gets powerful: multi-system workflows without changing Hermes core.
|
||||
|
||||
## Safe usage recommendations
|
||||
|
||||
### Prefer allowlists for dangerous systems
|
||||
|
||||
For anything financial, customer-facing, or destructive:
|
||||
- use `tools.include`
|
||||
- start with the smallest set possible
|
||||
|
||||
### Disable unused utilities
|
||||
|
||||
If you do not want the model browsing server-provided resources/prompts, turn them off:
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
resources: false
|
||||
prompts: false
|
||||
```
|
||||
|
||||
### Keep servers scoped narrowly
|
||||
|
||||
Examples:
|
||||
- filesystem server rooted to one project dir, not your whole home directory
|
||||
- git server pointed at one repo
|
||||
- internal API server with read-heavy tool exposure by default
|
||||
|
||||
### Reload after config changes
|
||||
|
||||
```text
|
||||
/reload-mcp
|
||||
```
|
||||
|
||||
Do this after changing:
|
||||
- include/exclude lists
|
||||
- enabled flags
|
||||
- resources/prompts toggles
|
||||
- auth headers / env
|
||||
|
||||
## Troubleshooting by symptom
|
||||
|
||||
### "The server connects but the tools I expected are missing"
|
||||
|
||||
Possible causes:
|
||||
- filtered by `tools.include`
|
||||
- excluded by `tools.exclude`
|
||||
- utility wrappers disabled via `resources: false` or `prompts: false`
|
||||
- server does not actually support resources/prompts
|
||||
|
||||
### "The server is configured but nothing loads"
|
||||
|
||||
Check:
|
||||
- `enabled: false` was not left in config
|
||||
- command/runtime exists (`npx`, `uvx`, etc.)
|
||||
- HTTP endpoint is reachable
|
||||
- auth env or headers are correct
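The first few checks in this list can be automated with a small script. The sketch below inspects one `mcp_servers` entry that has already been loaded into a dict; `diagnose_server` is a hypothetical helper, and the rule that an entry needs either `command` or `url` is an assumption drawn from the examples in this guide. It does not test network reachability or credentials.

```python
import shutil


def diagnose_server(name: str, cfg: dict) -> list[str]:
    """Return human-readable problems found in one mcp_servers entry."""
    problems = []
    if cfg.get("enabled", True) is False:
        problems.append(f"{name}: enabled is false")

    command, url = cfg.get("command"), cfg.get("url")
    if command and shutil.which(command) is None:
        problems.append(f"{name}: command {command!r} not found on PATH")
    if url and not url.startswith(("http://", "https://")):
        problems.append(f"{name}: url {url!r} is not an http(s) URL")
    if not command and not url:
        # Assumption: every entry needs a launcher command or a remote URL.
        problems.append(f"{name}: neither command nor url is set")
    return problems
```

An empty result only means these static checks passed; the logs remain the place to look for connection and authentication errors.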
|
||||
|
||||
### "Why do I see fewer tools than the MCP server advertises?"
|
||||
|
||||
Because Hermes applies your per-server filtering policy and registers utility wrappers only for capabilities the server actually supports. That is expected, and usually desirable.
|
||||
|
||||
### "How do I disable an MCP server without deleting the config?"
|
||||
|
||||
Use:
|
||||
|
||||
```yaml
|
||||
enabled: false
|
||||
```
|
||||
|
||||
That keeps the config around but prevents connection and registration.
|
||||
|
||||
## Recommended first MCP setups
|
||||
|
||||
Good first servers for most users:
|
||||
- filesystem
|
||||
- git
|
||||
- GitHub
|
||||
- fetch / documentation MCP servers
|
||||
- one narrow internal API
|
||||
|
||||
Not-great first servers:
|
||||
- giant business systems with lots of destructive actions and no filtering
|
||||
- anything you do not understand well enough to constrain
|
||||
|
||||
## Related docs
|
||||
|
||||
- [MCP (Model Context Protocol)](/docs/user-guide/features/mcp)
|
||||
- [FAQ](/docs/reference/faq)
|
||||
- [Slash Commands](/docs/reference/slash-commands)
|
||||
264
hermes_code/website/docs/guides/use-soul-with-hermes.md
Normal file
|
|
@ -0,0 +1,264 @@
|
|||
---
|
||||
sidebar_position: 6
|
||||
title: "Use SOUL.md with Hermes"
|
||||
description: "How to use SOUL.md to shape Hermes Agent's default voice, what belongs there, and how it differs from AGENTS.md and /personality"
|
||||
---
|
||||
|
||||
# Use SOUL.md with Hermes
|
||||
|
||||
`SOUL.md` is the **primary identity** for your Hermes instance. It's the first thing in the system prompt — it defines who the agent is, how it speaks, and what it avoids.
|
||||
|
||||
If you want Hermes to feel like the same assistant every time you talk to it — or if you want to replace the Hermes persona entirely with your own — this is the file to use.
|
||||
|
||||
## What SOUL.md is for
|
||||
|
||||
Use `SOUL.md` for:
|
||||
- tone
|
||||
- personality
|
||||
- communication style
|
||||
- how direct or warm Hermes should be
|
||||
- what Hermes should avoid stylistically
|
||||
- how Hermes should relate to uncertainty, disagreement, and ambiguity
|
||||
|
||||
In short:
|
||||
- `SOUL.md` is about who Hermes is and how Hermes speaks
|
||||
|
||||
## What SOUL.md is not for
|
||||
|
||||
Do not use it for:
|
||||
- repo-specific coding conventions
|
||||
- file paths
|
||||
- commands
|
||||
- service ports
|
||||
- architecture notes
|
||||
- project workflow instructions
|
||||
|
||||
Those belong in `AGENTS.md`.
|
||||
|
||||
A good rule:
|
||||
- if it should apply everywhere, put it in `SOUL.md`
|
||||
- if it only belongs to one project, put it in `AGENTS.md`
|
||||
|
||||
## Where it lives
|
||||
|
||||
Hermes now uses only the global SOUL file for the current instance:
|
||||
|
||||
```text
|
||||
~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
If you run Hermes with a custom home directory, it becomes:
|
||||
|
||||
```text
|
||||
$HERMES_HOME/SOUL.md
|
||||
```
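If you script around Hermes, the same lookup can be mirrored in a few lines. This is a sketch of the rule described above, not Hermes's own code; `soul_path` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Mapping


def soul_path(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve SOUL.md under $HERMES_HOME, defaulting to ~/.hermes."""
    home = env.get("HERMES_HOME")
    base = Path(home).expanduser() if home else Path.home() / ".hermes"
    return base / "SOUL.md"
```

Printing `soul_path()` is a quick way to confirm which file you should be editing.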
|
||||
|
||||
## First-run behavior
|
||||
|
||||
Hermes automatically seeds a starter `SOUL.md` for you if one does not already exist.
|
||||
|
||||
That means most users now begin with a real file they can read and edit immediately.
|
||||
|
||||
Important:
|
||||
- if you already have a `SOUL.md`, Hermes does not overwrite it
|
||||
- if the file exists but is empty, Hermes adds nothing from it to the prompt
|
||||
|
||||
## How Hermes uses it
|
||||
|
||||
When Hermes starts a session, it reads `SOUL.md` from `HERMES_HOME`, scans it for prompt-injection patterns, truncates it if needed, and uses it as the **agent identity** — slot #1 in the system prompt. This means SOUL.md completely replaces the built-in default identity text.
|
||||
|
||||
If SOUL.md is missing, empty, or cannot be loaded, Hermes falls back to a built-in default identity.
|
||||
|
||||
No wrapper language is added around the file. The content itself matters — write the way you want your agent to think and speak.
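The load-and-fallback behavior can be pictured roughly like this. Everything named here is an assumption for illustration: `DEFAULT_IDENTITY` is a placeholder string, `MAX_CHARS` is an invented limit, and the prompt-injection scan is omitted because its rules are not described in this guide.

```python
from pathlib import Path

DEFAULT_IDENTITY = "You are a helpful assistant."  # placeholder, not Hermes's real default
MAX_CHARS = 20_000  # hypothetical truncation limit


def load_identity(path: Path, max_chars: int = MAX_CHARS) -> str:
    """Return the SOUL.md text, or the default when it is missing, empty, or unreadable."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except (OSError, UnicodeDecodeError):
        return DEFAULT_IDENTITY  # missing or unreadable file
    if not text:
        return DEFAULT_IDENTITY  # empty file contributes nothing
    # The real pipeline also scans for prompt-injection patterns; not modeled here.
    return text[:max_chars]
```

Note that the text is returned as-is, with no wrapper added, which matches the point above that the content itself is what the model sees.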
|
||||
|
||||
## A good first edit
|
||||
|
||||
If you do nothing else, open the file and change just a few lines so it feels like you.
|
||||
|
||||
For example:
|
||||
|
||||
```markdown
|
||||
You are direct, calm, and technically precise.
|
||||
Prefer substance over politeness theater.
|
||||
Push back clearly when an idea is weak.
|
||||
Keep answers compact unless deeper detail is useful.
|
||||
```
|
||||
|
||||
That alone can noticeably change how Hermes feels.
|
||||
|
||||
## Example styles
|
||||
|
||||
### 1. Pragmatic engineer
|
||||
|
||||
```markdown
|
||||
You are a pragmatic senior engineer.
|
||||
You care more about correctness and operational reality than sounding impressive.
|
||||
|
||||
## Style
|
||||
- Be direct
|
||||
- Be concise unless complexity requires depth
|
||||
- Say when something is a bad idea
|
||||
- Prefer practical tradeoffs over idealized abstractions
|
||||
|
||||
## Avoid
|
||||
- Sycophancy
|
||||
- Hype language
|
||||
- Overexplaining obvious things
|
||||
```
|
||||
|
||||
### 2. Research partner
|
||||
|
||||
```markdown
|
||||
You are a thoughtful research collaborator.
|
||||
You are curious, honest about uncertainty, and excited by unusual ideas.
|
||||
|
||||
## Style
|
||||
- Explore possibilities without pretending certainty
|
||||
- Distinguish speculation from evidence
|
||||
- Ask clarifying questions when the idea space is underspecified
|
||||
- Prefer conceptual depth over shallow completeness
|
||||
```
|
||||
|
||||
### 3. Teacher / explainer
|
||||
|
||||
```markdown
|
||||
You are a patient technical teacher.
|
||||
You care about understanding, not performance.
|
||||
|
||||
## Style
|
||||
- Explain clearly
|
||||
- Use examples when they help
|
||||
- Do not assume prior knowledge unless the user signals it
|
||||
- Build from intuition to details
|
||||
```
|
||||
|
||||
### 4. Tough reviewer
|
||||
|
||||
```markdown
|
||||
You are a rigorous reviewer.
|
||||
You are fair, but you do not soften important criticism.
|
||||
|
||||
## Style
|
||||
- Point out weak assumptions directly
|
||||
- Prioritize correctness over harmony
|
||||
- Be explicit about risks and tradeoffs
|
||||
- Prefer blunt clarity to vague diplomacy
|
||||
```
|
||||
|
||||
## What makes a strong SOUL.md?
|
||||
|
||||
A strong `SOUL.md` is:
|
||||
- stable
|
||||
- broadly applicable
|
||||
- specific in voice
|
||||
- not overloaded with temporary instructions
|
||||
|
||||
A weak `SOUL.md` is:
|
||||
- full of project details
|
||||
- contradictory
|
||||
- trying to micro-manage every response shape
|
||||
- mostly generic filler like "be helpful" and "be clear"
|
||||
|
||||
Hermes already tries to be helpful and clear. `SOUL.md` should add real personality and style, not restate obvious defaults.
|
||||
|
||||
## Suggested structure
|
||||
|
||||
You do not need headings, but they help.
|
||||
|
||||
A simple structure that works well:
|
||||
|
||||
```markdown
|
||||
# Identity
|
||||
Who Hermes is.
|
||||
|
||||
# Style
|
||||
How Hermes should sound.
|
||||
|
||||
# Avoid
|
||||
What Hermes should not do.
|
||||
|
||||
# Defaults
|
||||
How Hermes should behave when ambiguity appears.
|
||||
```
|
||||
|
||||
## SOUL.md vs /personality
|
||||
|
||||
These are complementary.
|
||||
|
||||
Use `SOUL.md` for your durable baseline.
|
||||
Use `/personality` for temporary mode switches.
|
||||
|
||||
Examples:
|
||||
- your default SOUL is pragmatic and direct
|
||||
- then for one session you use `/personality teacher`
|
||||
- later you switch back without changing your base voice file
|
||||
|
||||
## SOUL.md vs AGENTS.md
|
||||
|
||||
This is the most common mistake.
|
||||
|
||||
### Put this in SOUL.md
|
||||
- “Be direct.”
|
||||
- “Avoid hype language.”
|
||||
- “Prefer short answers unless depth helps.”
|
||||
- “Push back when the user is wrong.”
|
||||
|
||||
### Put this in AGENTS.md
|
||||
- “Use pytest, not unittest.”
|
||||
- “Frontend lives in `frontend/`.”
|
||||
- “Never edit migrations directly.”
|
||||
- “The API runs on port 8000.”
|
||||
|
||||
## How to edit it
|
||||
|
||||
```bash
|
||||
nano ~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
or
|
||||
|
||||
```bash
|
||||
vim ~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
Then restart Hermes or start a new session.
|
||||
|
||||
## A practical workflow
|
||||
|
||||
1. Start with the seeded default file
|
||||
2. Trim anything that does not feel like the voice you want
|
||||
3. Add 4–8 lines that clearly define tone and defaults
|
||||
4. Talk to Hermes for a while
|
||||
5. Adjust based on what still feels off
|
||||
|
||||
That iterative approach works better than trying to design the perfect personality in one shot.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### I edited SOUL.md but Hermes still sounds the same
|
||||
|
||||
Check:
|
||||
- you edited `~/.hermes/SOUL.md` or `$HERMES_HOME/SOUL.md`
|
||||
- not some repo-local `SOUL.md`
|
||||
- the file is not empty
|
||||
- your session was restarted after the edit
|
||||
- a `/personality` overlay is not dominating the result
|
||||
|
||||
### Hermes is ignoring parts of my SOUL.md
|
||||
|
||||
Possible causes:
|
||||
- higher-priority instructions are overriding it
|
||||
- the file includes conflicting guidance
|
||||
- the file is too long and got truncated
|
||||
- some of the text resembles prompt-injection content and may be blocked or altered by the scanner
|
||||
|
||||
### My SOUL.md became too project-specific
|
||||
|
||||
Move project instructions into `AGENTS.md` and keep `SOUL.md` focused on identity and style.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Personality & SOUL.md](/docs/user-guide/features/personality)
|
||||
- [Context Files](/docs/user-guide/features/context-files)
|
||||
- [Configuration](/docs/user-guide/configuration)
|
||||
- [Tips & Best Practices](/docs/guides/tips)
|
||||
454
hermes_code/website/docs/guides/use-voice-mode-with-hermes.md
Normal file
|
|
@ -0,0 +1,454 @@
|
|||
---
|
||||
sidebar_position: 7
|
||||
title: "Use Voice Mode with Hermes"
|
||||
description: "A practical guide to setting up and using Hermes voice mode across CLI, Telegram, Discord, and Discord voice channels"
|
||||
---
|
||||
|
||||
# Use Voice Mode with Hermes
|
||||
|
||||
This guide is the practical companion to the [Voice Mode feature reference](/docs/user-guide/features/voice-mode).
|
||||
|
||||
If the feature page explains what voice mode can do, this guide shows how to actually use it well.
|
||||
|
||||
## What voice mode is good for
|
||||
|
||||
Voice mode is especially useful when:
|
||||
- you want a hands-free CLI workflow
|
||||
- you want spoken responses in Telegram or Discord
|
||||
- you want Hermes sitting in a Discord voice channel for live conversation
|
||||
- you want quick idea capture, debugging, or back-and-forth while walking around instead of typing
|
||||
|
||||
## Choose your voice mode setup
|
||||
|
||||
Hermes offers three distinct voice experiences.
|
||||
|
||||
| Mode | Best for | Platform |
|
||||
|---|---|---|
|
||||
| Interactive microphone loop | Personal hands-free use while coding or researching | CLI |
|
||||
| Voice replies in chat | Spoken responses alongside normal messaging | Telegram, Discord |
|
||||
| Live voice channel bot | Group or personal live conversation in a VC | Discord voice channels |
|
||||
|
||||
A good path is:
|
||||
1. get text working first
|
||||
2. enable voice replies second
|
||||
3. move to Discord voice channels last if you want the full experience
|
||||
|
||||
## Step 1: make sure normal Hermes works first
|
||||
|
||||
Before touching voice mode, verify that:
|
||||
- Hermes starts
|
||||
- your provider is configured
|
||||
- the agent can answer text prompts normally
|
||||
|
||||
```bash
|
||||
hermes
|
||||
```
|
||||
|
||||
Ask something simple:
|
||||
|
||||
```text
|
||||
What tools do you have available?
|
||||
```
|
||||
|
||||
If that is not solid yet, fix text mode first.
|
||||
|
||||
## Step 2: install the right extras
|
||||
|
||||
### CLI microphone + playback
|
||||
|
||||
```bash
|
||||
pip install "hermes-agent[voice]"
|
||||
```
|
||||
|
||||
### Messaging platforms
|
||||
|
||||
```bash
|
||||
pip install "hermes-agent[messaging]"
|
||||
```
|
||||
|
||||
### Premium ElevenLabs TTS
|
||||
|
||||
```bash
|
||||
pip install "hermes-agent[tts-premium]"
|
||||
```
|
||||
|
||||
### Local NeuTTS (optional)
|
||||
|
||||
```bash
|
||||
python -m pip install -U "neutts[all]"
|
||||
```
|
||||
|
||||
### Everything
|
||||
|
||||
```bash
|
||||
pip install "hermes-agent[all]"
|
||||
```
|
||||
|
||||
## Step 3: install system dependencies
|
||||
|
||||
### macOS
|
||||
|
||||
```bash
|
||||
brew install portaudio ffmpeg opus
|
||||
brew install espeak-ng
|
||||
```
|
||||
|
||||
### Ubuntu / Debian
|
||||
|
||||
```bash
|
||||
sudo apt install portaudio19-dev ffmpeg libopus0
|
||||
sudo apt install espeak-ng
|
||||
```
|
||||
|
||||
Why these matter:
|
||||
- `portaudio` → microphone input / playback for CLI voice mode
|
||||
- `ffmpeg` → audio conversion for TTS and messaging delivery
|
||||
- `opus` → Discord voice codec support
|
||||
- `espeak-ng` → phonemizer backend for NeuTTS
|
||||
|
||||
## Step 4: choose STT and TTS providers
|
||||
|
||||
Hermes supports both local and cloud speech stacks.
|
||||
|
||||
### Easiest / cheapest setup
|
||||
|
||||
Use local STT and free Edge TTS:
|
||||
- STT provider: `local`
|
||||
- TTS provider: `edge`
|
||||
|
||||
This is usually the best place to start.
|
||||
|
||||
### Environment file example
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Cloud STT options (local needs no key)
|
||||
GROQ_API_KEY=***
|
||||
VOICE_TOOLS_OPENAI_KEY=***
|
||||
|
||||
# Premium TTS (optional)
|
||||
ELEVENLABS_API_KEY=***
|
||||
```
|
||||
|
||||
### Provider recommendations
|
||||
|
||||
#### Speech-to-text
|
||||
|
||||
- `local` → best default for privacy and zero-cost use
|
||||
- `groq` → very fast cloud transcription
|
||||
- `openai` → good paid fallback
|
||||
|
||||
#### Text-to-speech
|
||||
|
||||
- `edge` → free and good enough for most users
|
||||
- `neutts` → free local/on-device TTS
|
||||
- `elevenlabs` → best quality
|
||||
- `openai` → good middle ground
|
||||
|
||||
### If you use `hermes setup`
|
||||
|
||||
If you choose NeuTTS in the setup wizard, Hermes checks whether `neutts` is already installed. If it is missing, the wizard tells you NeuTTS needs the Python package `neutts` and the system package `espeak-ng`, offers to install them for you, installs `espeak-ng` with your platform package manager, and then runs:
|
||||
|
||||
```bash
|
||||
python -m pip install -U "neutts[all]"
|
||||
```
|
||||
|
||||
If you skip that install or it fails, the wizard falls back to Edge TTS.
|
||||
|
||||
## Step 5: recommended config
|
||||
|
||||
```yaml
|
||||
voice:
|
||||
record_key: "ctrl+b"
|
||||
max_recording_seconds: 120
|
||||
auto_tts: false
|
||||
silence_threshold: 200
|
||||
silence_duration: 3.0
|
||||
|
||||
stt:
|
||||
provider: "local"
|
||||
local:
|
||||
model: "base"
|
||||
|
||||
tts:
|
||||
provider: "edge"
|
||||
edge:
|
||||
voice: "en-US-AriaNeural"
|
||||
```
|
||||
|
||||
This is a good conservative default for most people.
|
||||
|
||||
If you want local TTS instead, switch the `tts` block to:
|
||||
|
||||
```yaml
|
||||
tts:
|
||||
provider: "neutts"
|
||||
neutts:
|
||||
ref_audio: ''
|
||||
ref_text: ''
|
||||
model: neuphonic/neutts-air-q4-gguf
|
||||
device: cpu
|
||||
```
|
||||
|
||||
## Use case 1: CLI voice mode
|
||||
|
||||
### Turn it on
|
||||
|
||||
Start Hermes:
|
||||
|
||||
```bash
|
||||
hermes
|
||||
```
|
||||
|
||||
Inside the CLI:
|
||||
|
||||
```text
|
||||
/voice on
|
||||
```
|
||||
|
||||
### Recording flow
|
||||
|
||||
Default key:
|
||||
- `Ctrl+B`
|
||||
|
||||
Workflow:
|
||||
1. press `Ctrl+B`
|
||||
2. speak
|
||||
3. wait for silence detection to stop recording automatically
|
||||
4. Hermes transcribes and responds
|
||||
5. if TTS is on, it speaks the answer
|
||||
6. the loop can automatically restart for continuous use
|
||||
|
||||
### Useful commands
|
||||
|
||||
```text
|
||||
/voice
|
||||
/voice on
|
||||
/voice off
|
||||
/voice tts
|
||||
/voice status
|
||||
```
|
||||
|
||||
### Good CLI workflows
|
||||
|
||||
#### Walk-up debugging
|
||||
|
||||
Say:
|
||||
|
||||
```text
|
||||
I keep getting a docker permission error. Help me debug it.
|
||||
```
|
||||
|
||||
Then continue hands-free:
|
||||
- "Read the last error again"
|
||||
- "Explain the root cause in simpler terms"
|
||||
- "Now give me the exact fix"
|
||||
|
||||
#### Research / brainstorming
|
||||
|
||||
Great for:
|
||||
- walking around while thinking
|
||||
- dictating half-formed ideas
|
||||
- asking Hermes to structure your thoughts in real time
|
||||
|
||||
#### Accessibility / low-typing sessions
|
||||
|
||||
If typing is inconvenient, voice mode is one of the fastest ways to stay in the full Hermes loop.
|
||||
|
||||
## Tuning CLI behavior
|
||||
|
||||
### Silence threshold
|
||||
|
||||
If Hermes starts/stops too aggressively, tune:
|
||||
|
||||
```yaml
|
||||
voice:
|
||||
silence_threshold: 250
|
||||
```
|
||||
|
||||
Higher threshold = less sensitive.
|
||||
|
||||
### Silence duration
|
||||
|
||||
If you pause a lot between sentences, increase:
|
||||
|
||||
```yaml
|
||||
voice:
|
||||
silence_duration: 4.0
|
||||
```
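To see how the two silence settings interact, here is a small model of silence-based stop detection. It is a sketch, not Hermes's recorder: the function names are hypothetical, and it assumes `silence_threshold` is compared against the RMS amplitude of each frame of PCM samples, which this guide does not state explicitly.

```python
import math
from typing import Iterable, Optional, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def frames_until_stop(
    frames: Iterable[Sequence[int]],
    frame_seconds: float,
    silence_threshold: float = 200,
    silence_duration: float = 3.0,
) -> Optional[int]:
    """Return how many frames are consumed before recording stops, or None."""
    # Number of consecutive quiet frames that add up to silence_duration.
    needed = max(1, math.ceil(round(silence_duration / frame_seconds, 6)))
    quiet_run = 0
    for count, frame in enumerate(frames, start=1):
        if rms(frame) < silence_threshold:
            quiet_run += 1
            if quiet_run >= needed:
                return count
        else:
            quiet_run = 0  # speech resets the silence timer
    return None
```

In this model, raising `silence_threshold` makes more audio count as silence, and raising `silence_duration` requires a longer quiet run before recording stops.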
|
||||
|
||||
### Record key
|
||||
|
||||
If `Ctrl+B` conflicts with your terminal or tmux habits:
|
||||
|
||||
```yaml
|
||||
voice:
|
||||
record_key: "ctrl+space"
|
||||
```
|
||||
|
||||
## Use case 2: voice replies in Telegram or Discord
|
||||
|
||||
This mode is simpler than full voice channels.
|
||||
|
||||
Hermes stays a normal chat bot, but can speak replies.
|
||||
|
||||
### Start the gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
### Turn on voice replies
|
||||
|
||||
Inside Telegram or Discord:
|
||||
|
||||
```text
|
||||
/voice on
|
||||
```
|
||||
|
||||
or
|
||||
|
||||
```text
|
||||
/voice tts
|
||||
```
|
||||
|
||||
### Modes
|
||||
|
||||
| Mode | Meaning |
|
||||
|---|---|
|
||||
| `off` | text only |
|
||||
| `voice_only` | speak only when the user sent voice |
|
||||
| `all` | speak every reply |
|
||||
|
||||
### When to use which mode
|
||||
|
||||
- `/voice on` if you want spoken replies only for voice-originating messages
|
||||
- `/voice tts` if you want a full spoken assistant all the time
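The mode table reduces to a small decision function. The sketch below is illustrative: `should_speak` is a hypothetical name, and the pairing of `/voice on` with `voice_only` and `/voice tts` with `all` is inferred from the descriptions above rather than confirmed.

```python
def should_speak(mode: str, user_sent_voice: bool) -> bool:
    """Decide whether a reply is spoken, following the mode table above."""
    if mode == "all":
        return True  # speak every reply
    if mode == "voice_only":
        return user_sent_voice  # speak only when the user sent voice
    if mode == "off":
        return False  # text only
    raise ValueError(f"unknown voice mode: {mode!r}")
```

For example, in `voice_only` mode a typed message gets a text reply, while a voice note gets a spoken one.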
|
||||
|
||||
### Good messaging workflows
|
||||
|
||||
#### Telegram assistant on your phone
|
||||
|
||||
Use when:
|
||||
- you are away from your machine
|
||||
- you want to send voice notes and get quick spoken replies
|
||||
- you want Hermes to function like a portable research or ops assistant
|
||||
|
||||
#### Discord DMs with spoken output
|
||||
|
||||
Useful when you want private interaction without server-channel mention behavior.
|
||||
|
||||
## Use case 3: Discord voice channels
|
||||
|
||||
This is the most advanced mode.
|
||||
|
||||
Hermes joins a Discord VC, listens to user speech, transcribes it, runs the normal agent pipeline, and speaks replies back into the channel.
|
||||
|
||||
### Required Discord permissions
|
||||
|
||||
In addition to the normal text-bot setup, make sure the bot has:
|
||||
- Connect
|
||||
- Speak
|
||||
- preferably Use Voice Activity
|
||||
|
||||
Also enable privileged intents in the Developer Portal:
|
||||
- Presence Intent
|
||||
- Server Members Intent
|
||||
- Message Content Intent
|
||||
|
||||
### Join and leave
|
||||
|
||||
In a Discord text channel where the bot is present:
|
||||
|
||||
```text
|
||||
/voice join
|
||||
/voice leave
|
||||
/voice status
|
||||
```
|
||||
|
||||
### What happens when joined
|
||||
|
||||
- users speak in the VC
|
||||
- Hermes detects speech boundaries
|
||||
- transcripts are posted in the associated text channel
|
||||
- Hermes responds in text and audio
|
||||
- the text channel is the one where `/voice join` was issued
|
||||
|
||||
### Best practices for Discord VC use
|
||||
|
||||
- keep `DISCORD_ALLOWED_USERS` tight
|
||||
- use a dedicated bot/testing channel at first
|
||||
- verify STT and TTS work in ordinary text-chat voice mode before trying VC mode
|
||||
|
||||
## Voice quality recommendations
|
||||
|
||||
### Best quality setup
|
||||
|
||||
- STT: local `large-v3` or Groq `whisper-large-v3`
|
||||
- TTS: ElevenLabs
|
||||
|
||||
### Best speed / convenience setup
|
||||
|
||||
- STT: local `base` or Groq
|
||||
- TTS: Edge
|
||||
|
||||
### Best zero-cost setup
|
||||
|
||||
- STT: local
|
||||
- TTS: Edge
|
||||
|
||||
## Common failure modes
|
||||
|
||||
### "No audio device found"
|
||||
|
||||
Install `portaudio` (see Step 3 for the per-platform command).
|
||||
|
||||
### "Bot joins but hears nothing"
|
||||
|
||||
Check:
|
||||
- your Discord user ID is in `DISCORD_ALLOWED_USERS`
|
||||
- you are not muted
|
||||
- privileged intents are enabled
|
||||
- the bot has Connect/Speak permissions
|
||||
|
||||
### "It transcribes but does not speak"
|
||||
|
||||
Check:
|
||||
- TTS provider config
|
||||
- API key / quota for ElevenLabs or OpenAI
|
||||
- `ffmpeg` install for Edge conversion paths
|
||||
|
||||
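A quick pre-flight check can rule out the missing-`ffmpeg` case before you dig into provider configuration. The sketch below uses only standard shell built-ins; `missing_tools` is a hypothetical helper name.

```bash
# Hypothetical pre-flight helper: print each named tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example usage: ffmpeg is the dependency named in the checklist above.
missing=$(missing_tools ffmpeg)
if [ -n "$missing" ]; then
  echo "missing tools: $missing" >&2
else
  echo "all required tools found"
fi
```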
### "Whisper outputs garbage"
|
||||
|
||||
Try:
|
||||
- quieter environment
|
||||
- higher `silence_threshold`
|
||||
- different STT provider/model
|
||||
- shorter, clearer utterances
|
||||
|
||||
### "It works in DMs but not in server channels"
|
||||
|
||||
This is usually caused by the mention policy.
|
||||
|
||||
By default, the bot needs an `@mention` in Discord server text channels unless configured otherwise.
|
||||
|
||||
## Suggested first-week setup
|
||||
|
||||
If you want the shortest path to success:
|
||||
|
||||
1. get text Hermes working
|
||||
2. install `hermes-agent[voice]`
|
||||
3. use CLI voice mode with local STT + Edge TTS
|
||||
4. then enable `/voice on` in Telegram or Discord
|
||||
5. only after that, try Discord VC mode
|
||||
|
||||
That progression keeps the debugging surface small.
|
||||
|
||||
## Where to read next
|
||||
|
||||
- [Voice Mode feature reference](/docs/user-guide/features/voice-mode)
|
||||
- [Messaging Gateway](/docs/user-guide/messaging)
|
||||
- [Discord setup](/docs/user-guide/messaging/discord)
|
||||
- [Telegram setup](/docs/user-guide/messaging/telegram)
|
||||
- [Configuration](/docs/user-guide/configuration)
|
||||
56
hermes_code/website/docs/index.md
Normal file
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
slug: /
|
||||
sidebar_position: 0
|
||||
title: "Hermes Agent Documentation"
|
||||
description: "The self-improving AI agent built by Nous Research. A built-in learning loop that creates skills from experience, improves them during use, and remembers across sessions."
|
||||
hide_table_of_contents: true
|
||||
---
|
||||
|
||||
# Hermes Agent
|
||||
|
||||
The self-improving AI agent built by [Nous Research](https://nousresearch.com). The only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, and builds a deepening model of who you are across sessions.
|
||||
|
||||
<div style={{display: 'flex', gap: '1rem', marginBottom: '2rem', flexWrap: 'wrap'}}>
|
||||
<a href="/docs/getting-started/installation" style={{display: 'inline-block', padding: '0.6rem 1.2rem', backgroundColor: '#FFD700', color: '#07070d', borderRadius: '8px', fontWeight: 600, textDecoration: 'none'}}>Get Started →</a>
|
||||
<a href="https://github.com/NousResearch/hermes-agent" style={{display: 'inline-block', padding: '0.6rem 1.2rem', border: '1px solid rgba(255,215,0,0.2)', borderRadius: '8px', textDecoration: 'none'}}>View on GitHub</a>
|
||||
</div>
|
||||
|
||||
## What is Hermes Agent?
|
||||
|
||||
It's not a coding copilot tethered to an IDE or a chatbot wrapper around a single API. It's an **autonomous agent** that gets more capable the longer it runs. It lives wherever you put it — a $5 VPS, a GPU cluster, or serverless infrastructure (Daytona, Modal) that costs nearly nothing when idle. Talk to it from Telegram while it works on a cloud VM you never SSH into yourself. It's not tied to your laptop.
|
||||
|
||||
## Quick Links
|
||||
|
||||
| | |
|
||||
|---|---|
|
||||
| 🚀 **[Installation](/docs/getting-started/installation)** | Install in 60 seconds on Linux, macOS, or WSL2 |
|
||||
| 📖 **[Quickstart Tutorial](/docs/getting-started/quickstart)** | Your first conversation and key features to try |
|
||||
| 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
|
||||
| ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
|
||||
| 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, or WhatsApp |
|
||||
| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 40+ built-in tools and how to configure them |
|
||||
| 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
|
||||
| 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
|
||||
| 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
|
||||
| 🧭 **[Use MCP with Hermes](/docs/guides/use-mcp-with-hermes)** | Practical MCP setup patterns, examples, and tutorials |
|
||||
| 🎙️ **[Voice Mode](/docs/user-guide/features/voice-mode)** | Real-time voice interaction in CLI, Telegram, Discord, and Discord VC |
|
||||
| 🗣️ **[Use Voice Mode with Hermes](/docs/guides/use-voice-mode-with-hermes)** | Hands-on setup and usage patterns for Hermes voice workflows |
|
||||
| 🎭 **[Personality & SOUL.md](/docs/user-guide/features/personality)** | Define Hermes' default voice with a global SOUL.md |
|
||||
| 📄 **[Context Files](/docs/user-guide/features/context-files)** | Project context files that shape every conversation |
|
||||
| 🔒 **[Security](/docs/user-guide/security)** | Command approval, authorization, container isolation |
|
||||
| 💡 **[Tips & Best Practices](/docs/guides/tips)** | Quick wins to get the most out of Hermes |
|
||||
| 🏗️ **[Architecture](/docs/developer-guide/architecture)** | How it works under the hood |
|
||||
| ❓ **[FAQ & Troubleshooting](/docs/reference/faq)** | Common questions and solutions |
|
||||
|
||||
## Key Features
|
||||
|
||||
- **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling
|
||||
- **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing
|
||||
- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, all from one gateway
|
||||
- **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint
|
||||
- **Scheduled automations** — Built-in cron with delivery to any platform
|
||||
- **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls
|
||||
- **Open standard skills** — Compatible with [agentskills.io](https://agentskills.io). Skills are portable, shareable, and community-contributed via the Skills Hub
|
||||
- **Full web control** — Search, extract, browse, vision, image generation, TTS
|
||||
- **MCP support** — Connect to any MCP server for extended tool capabilities
|
||||
- **Research-ready** — Batch processing, trajectory export, RL training with Atropos. Built by [Nous Research](https://nousresearch.com) — the lab behind Hermes, Nomos, and Psyche models
|
||||
8
hermes_code/website/docs/reference/_category_.json
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "Reference",
|
||||
"position": 4,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Complete reference for CLI commands, environment variables, and configuration."
|
||||
}
|
||||
}
|
||||
445
hermes_code/website/docs/reference/cli-commands.md
Normal file
|
|
@ -0,0 +1,445 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "CLI Commands Reference"
|
||||
description: "Authoritative reference for Hermes terminal commands and command families"
|
||||
---
|
||||
|
||||
# CLI Commands Reference
|
||||
|
||||
This page covers the **terminal commands** you run from your shell.
|
||||
|
||||
For in-chat slash commands, see [Slash Commands Reference](./slash-commands.md).
|
||||
|
||||
## Global entrypoint
|
||||
|
||||
```bash
|
||||
hermes [global-options] <command> [subcommand/options]
|
||||
```
|
||||
|
||||
### Global options
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--version`, `-V` | Show version and exit. |
|
||||
| `--resume <session>`, `-r <session>` | Resume a previous session by ID or title. |
|
||||
| `--continue [name]`, `-c [name]` | Resume the most recent session, or the most recent session matching a title. |
|
||||
| `--worktree`, `-w` | Start in an isolated git worktree for parallel-agent workflows. |
|
||||
| `--yolo` | Bypass dangerous-command approval prompts. |
|
||||
| `--pass-session-id` | Include the session ID in the agent's system prompt. |
|
||||
|
||||
## Top-level commands
|
||||
|
||||
| Command | Purpose |
|
||||
|---------|---------|
|
||||
| `hermes chat` | Interactive or one-shot chat with the agent. |
|
||||
| `hermes model` | Interactively choose the default provider and model. |
|
||||
| `hermes gateway` | Run or manage the messaging gateway service. |
|
||||
| `hermes setup` | Interactive setup wizard for all or part of the configuration. |
|
||||
| `hermes whatsapp` | Configure and pair the WhatsApp bridge. |
|
||||
| `hermes login` / `logout` | Authenticate with OAuth-backed providers. |
|
||||
| `hermes status` | Show agent, auth, and platform status. |
|
||||
| `hermes cron` | Create, edit, pause, run, and inspect scheduled jobs, or tick the scheduler. |
|
||||
| `hermes doctor` | Diagnose config and dependency issues. |
|
||||
| `hermes config` | Show, edit, migrate, and query configuration files. |
|
||||
| `hermes pairing` | Approve or revoke messaging pairing codes. |
|
||||
| `hermes skills` | Browse, install, publish, audit, and configure skills. |
|
||||
| `hermes honcho` | Manage Honcho cross-session memory integration. |
|
||||
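| `hermes acp` | Run Hermes as an ACP server for editor integration. |
|
||||
| `hermes mcp` | Manage MCP server configurations. |
|
||||
| `hermes plugins` | Manage Hermes Agent plugins. |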
|
||||
| `hermes tools` | Configure enabled tools per platform. |
|
||||
| `hermes sessions` | Browse, export, prune, rename, and delete sessions. |
|
||||
| `hermes insights` | Show token/cost/activity analytics. |
|
||||
| `hermes claw` | OpenClaw migration helpers. |
|
||||
| `hermes version` | Show version information. |
|
||||
| `hermes update` | Pull latest code and reinstall dependencies. |
|
||||
| `hermes uninstall` | Remove Hermes from the system. |
|
||||
|
||||
## `hermes chat`
|
||||
|
||||
```bash
|
||||
hermes chat [options]
|
||||
```
|
||||
|
||||
Common options:
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `-q`, `--query "..."` | One-shot, non-interactive prompt. |
|
||||
| `-m`, `--model <model>` | Override the model for this run. |
|
||||
| `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
|
||||
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`. |
|
||||
| `-s`, `--skills <name>` | Preload one or more skills for the session (can be repeated or comma-separated). |
|
||||
| `-v`, `--verbose` | Verbose output. |
|
||||
| `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
|
||||
| `--resume <session>` / `--continue [name]` | Resume a session directly from `chat`. |
|
||||
| `--worktree` | Create an isolated git worktree for this run. |
|
||||
| `--checkpoints` | Enable filesystem checkpoints before destructive file changes. |
|
||||
| `--yolo` | Skip approval prompts. |
|
||||
| `--pass-session-id` | Pass the session ID into the system prompt. |
|
||||
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
hermes
|
||||
hermes chat -q "Summarize the latest PRs"
|
||||
hermes chat --provider openrouter --model anthropic/claude-sonnet-4.6
|
||||
hermes chat --toolsets web,terminal,skills
|
||||
hermes chat --quiet -q "Return only JSON"
|
||||
hermes chat --worktree -q "Review this repo and open a PR"
|
||||
```
|
||||
|
||||
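The `--quiet` and `-q` flags make `hermes chat` usable from scripts. The following is a sketch only: `ask_hermes` is a hypothetical wrapper, and it assumes the reply is written to stdout and that a failed run exits with a non-zero status.

```bash
# Hypothetical wrapper around the documented flags --quiet and -q/--query.
# Assumes the agent's reply is written to stdout.
ask_hermes() {
  hermes chat --quiet -q "$1"
}

# Only run the example when the hermes CLI is actually installed.
if command -v hermes >/dev/null 2>&1; then
  reply=$(ask_hermes "Return only JSON describing the current directory") || {
    echo "hermes chat failed" >&2
    exit 1
  }
  # Save the captured reply for a later pipeline step.
  printf '%s\n' "$reply" > reply.json
fi
```

Quoting `"$1"` keeps a multi-word prompt as a single argument to `-q`.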
## `hermes model`
|
||||
|
||||
Interactive provider + model selector.
|
||||
|
||||
```bash
|
||||
hermes model
|
||||
```
|
||||
|
||||
Use this when you want to:
|
||||
- switch default providers
|
||||
- log into OAuth-backed providers during model selection
|
||||
- pick from provider-specific model lists
|
||||
- configure a custom/self-hosted endpoint
|
||||
- save the new default into config
|
||||
|
||||
### `/model` slash command (mid-session)
|
||||
|
||||
Switch models without leaving a session:
|
||||
|
||||
```text
|
||||
/model # Show current model and available options
|
||||
/model claude-sonnet-4 # Switch model (auto-detects provider)
|
||||
/model zai:glm-5 # Switch provider and model
|
||||
/model custom:qwen-2.5 # Use model on your custom endpoint
|
||||
/model custom # Auto-detect model from custom endpoint
|
||||
/model custom:local:qwen-2.5 # Use a named custom provider
|
||||
/model openrouter:anthropic/claude-sonnet-4 # Switch back to cloud
|
||||
```
|
||||
|
||||
Provider and base URL changes are persisted to `config.yaml` automatically. When you switch away from a custom endpoint, the stale base URL is cleared so that it cannot leak into other providers.
|
||||
|
||||
## `hermes gateway`
|
||||
|
||||
```bash
|
||||
hermes gateway <subcommand>
|
||||
```
|
||||
|
||||
Subcommands:
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `run` | Run the gateway in the foreground. |
|
||||
| `start` | Start the installed gateway service. |
|
||||
| `stop` | Stop the service. |
|
||||
| `restart` | Restart the service. |
|
||||
| `status` | Show service status. |
|
||||
| `install` | Install as a user service (`systemd` on Linux, `launchd` on macOS). |
|
||||
| `uninstall` | Remove the installed service. |
|
||||
| `setup` | Interactive messaging-platform setup. |
|
||||
|
||||
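The subcommands above can be chained into a first-time service install. This is a sketch under assumptions: the ordering is a suggestion rather than a documented requirement, and `gateway_first_run` is a hypothetical function name.

```bash
# Sketch of a first-time gateway install using only the subcommands listed above.
# Stops at the first step that fails.
gateway_first_run() {
  # Interactive messaging-platform setup.
  hermes gateway setup || return 1
  # Register the user service (systemd on Linux, launchd on macOS).
  hermes gateway install || return 1
  # Start the installed service.
  hermes gateway start || return 1
  # Confirm the service is up.
  hermes gateway status
}
```

Call `gateway_first_run` from an interactive terminal, since the `setup` step prompts for input. For debugging, `hermes gateway run` keeps the gateway in the foreground instead.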
## `hermes setup`
|
||||
|
||||
```bash
|
||||
hermes setup [model|terminal|gateway|tools|agent] [--non-interactive] [--reset]
|
||||
```
|
||||
|
||||
Use the full wizard or jump into one section:
|
||||
|
||||
| Section | Description |
|
||||
|---------|-------------|
|
||||
| `model` | Provider and model setup. |
|
||||
| `terminal` | Terminal backend and sandbox setup. |
|
||||
| `gateway` | Messaging platform setup. |
|
||||
| `tools` | Enable/disable tools per platform. |
|
||||
| `agent` | Agent behavior settings. |
|
||||
|
||||
Options:
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--non-interactive` | Use defaults / environment values without prompts. |
|
||||
| `--reset` | Reset configuration to defaults before setup. |
|
||||
|
||||
## `hermes whatsapp`
|
||||
|
||||
```bash
|
||||
hermes whatsapp
|
||||
```
|
||||
|
||||
Runs the WhatsApp pairing/setup flow, including mode selection and QR-code pairing.
|
||||
|
||||
## `hermes login` / `hermes logout`
|
||||
|
||||
```bash
|
||||
hermes login [--provider nous|openai-codex] [--portal-url ...] [--inference-url ...]
|
||||
hermes logout [--provider nous|openai-codex]
|
||||
```
|
||||
|
||||
`login` supports:
|
||||
- Nous Portal OAuth/device flow
|
||||
- OpenAI Codex OAuth/device flow
|
||||
|
||||
Useful options for `login`:
|
||||
- `--no-browser`
|
||||
- `--timeout <seconds>`
|
||||
- `--ca-bundle <pem>`
|
||||
- `--insecure`
|
||||
|
||||
## `hermes status`
|
||||
|
||||
```bash
|
||||
hermes status [--all] [--deep]
|
||||
```
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--all` | Show all details in a shareable redacted format. |
|
||||
| `--deep` | Run deeper checks that may take longer. |
|
||||
|
||||
## `hermes cron`
|
||||
|
||||
```bash
|
||||
hermes cron <list|create|edit|pause|resume|run|remove|status|tick>
|
||||
```
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `list` | Show scheduled jobs. |
|
||||
| `create` / `add` | Create a scheduled job from a prompt, optionally attaching one or more skills via repeated `--skill`. |
|
||||
| `edit` | Update a job's schedule, prompt, name, delivery, repeat count, or attached skills. Supports `--clear-skills`, `--add-skill`, and `--remove-skill`. |
|
||||
| `pause` | Pause a job without deleting it. |
|
||||
| `resume` | Resume a paused job and compute its next future run. |
|
||||
| `run` | Trigger a job on the next scheduler tick. |
|
||||
| `remove` | Delete a scheduled job. |
|
||||
| `status` | Check whether the cron scheduler is running. |
|
||||
| `tick` | Run due jobs once and exit. |
|
||||
|
||||
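Because `tick` runs due jobs once and exits, it can be driven by an external loop or scheduler. The sketch below assumes that calling `tick` repeatedly is safe and that no other scheduler is already running the same jobs. `run_ticks` and `INTERVAL` are hypothetical names for this script, not Hermes settings.

```bash
# Sketch: call `hermes cron tick` a fixed number of times with a pause in between.
# INTERVAL (seconds) is a knob of this script only; it defaults to 60.
run_ticks() {
  count=$1
  interval=${INTERVAL:-60}
  i=0
  while [ "$i" -lt "$count" ]; do
    # Log a failed tick but keep going so one bad run does not stop the loop.
    hermes cron tick || echo "tick failed with status $?" >&2
    i=$((i + 1))
    # Sleep between ticks, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, `INTERVAL=60 run_ticks 10` would tick once a minute for ten minutes.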
## `hermes doctor`
|
||||
|
||||
```bash
|
||||
hermes doctor [--fix]
|
||||
```
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--fix` | Attempt automatic repairs where possible. |
|
||||
|
||||
## `hermes config`
|
||||
|
||||
```bash
|
||||
hermes config <subcommand>
|
||||
```
|
||||
|
||||
Subcommands:
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `show` | Show current config values. |
|
||||
| `edit` | Open `config.yaml` in your editor. |
|
||||
| `set <key> <value>` | Set a config value. |
|
||||
| `path` | Print the config file path. |
|
||||
| `env-path` | Print the `.env` file path. |
|
||||
| `check` | Check for missing or stale config. |
|
||||
| `migrate` | Add newly introduced options interactively. |
|
||||
|
||||
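As an illustration of these subcommands, the sketch below sets one documented variable and then re-checks the configuration. `set_timezone` is a hypothetical helper, and `HERMES_TIMEZONE` is used only because the environment variable reference gives an example value for it.

```bash
# Hypothetical helper: set the timezone override, then check for missing or stale config.
set_timezone() {
  hermes config set HERMES_TIMEZONE "$1" || return 1
  hermes config check
}

# Only run the example when the hermes CLI is actually installed.
if command -v hermes >/dev/null 2>&1; then
  echo "config file: $(hermes config path)"
  echo "env file:    $(hermes config env-path)"
  set_timezone "America/New_York"
fi
```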
## `hermes pairing`
|
||||
|
||||
```bash
|
||||
hermes pairing <list|approve|revoke|clear-pending>
|
||||
```
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `list` | Show pending and approved users. |
|
||||
| `approve <platform> <code>` | Approve a pairing code. |
|
||||
| `revoke <platform> <user-id>` | Revoke a user's access. |
|
||||
| `clear-pending` | Clear pending pairing codes. |
|
||||
|
||||
## `hermes skills`
|
||||
|
||||
```bash
|
||||
hermes skills <subcommand>
|
||||
```
|
||||
|
||||
Subcommands:
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `browse` | Paginated browser for skill registries. |
|
||||
| `search` | Search skill registries. |
|
||||
| `install` | Install a skill. |
|
||||
| `inspect` | Preview a skill without installing it. |
|
||||
| `list` | List installed skills. |
|
||||
| `check` | Check installed hub skills for upstream updates. |
|
||||
| `update` | Reinstall hub skills with upstream changes when available. |
|
||||
| `audit` | Re-scan installed hub skills. |
|
||||
| `uninstall` | Remove a hub-installed skill. |
|
||||
| `publish` | Publish a skill to a registry. |
|
||||
| `snapshot` | Export/import skill configurations. |
|
||||
| `tap` | Manage custom skill sources. |
|
||||
| `config` | Interactive enable/disable configuration for skills by platform. |
|
||||
|
||||
Common examples:
|
||||
|
||||
```bash
|
||||
hermes skills browse
|
||||
hermes skills browse --source official
|
||||
hermes skills search react --source skills-sh
|
||||
hermes skills search https://mintlify.com/docs --source well-known
|
||||
hermes skills inspect official/security/1password
|
||||
hermes skills inspect skills-sh/vercel-labs/json-render/json-render-react
|
||||
hermes skills install official/migration/openclaw-migration
|
||||
hermes skills install skills-sh/anthropics/skills/pdf --force
|
||||
hermes skills check
|
||||
hermes skills update
|
||||
hermes skills config
|
||||
```
|
||||
|
||||
Notes:
|
||||
- `--force` can override non-dangerous policy blocks for third-party/community skills.
|
||||
- `--force` does not override a `dangerous` scan verdict.
|
||||
- `--source skills-sh` searches the public `skills.sh` directory.
|
||||
- `--source well-known` lets you point Hermes at a site exposing `/.well-known/skills/index.json`.
|
||||
|
||||
## `hermes honcho`
|
||||
|
||||
```bash
|
||||
hermes honcho <subcommand>
|
||||
```
|
||||
|
||||
Subcommands:
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `setup` | Interactive Honcho setup wizard. |
|
||||
| `status` | Show current Honcho config and connection status. |
|
||||
| `sessions` | List known Honcho session mappings. |
|
||||
| `map` | Map the current directory to a Honcho session name. |
|
||||
| `peer` | Show or update peer names and dialectic reasoning level. |
|
||||
| `mode` | Show or set memory mode: `hybrid`, `honcho`, or `local`. |
|
||||
| `tokens` | Show or set token budgets for context and dialectic. |
|
||||
| `identity` | Seed or show the AI peer identity representation. |
|
||||
| `migrate` | Migration guide from openclaw-honcho to Hermes Honcho. |
|
||||
|
||||
## `hermes acp`
|
||||
|
||||
```bash
|
||||
hermes acp
|
||||
```
|
||||
|
||||
Starts Hermes as an ACP (Agent Client Protocol) stdio server for editor integration.
|
||||
|
||||
Related entrypoints:
|
||||
|
||||
```bash
|
||||
hermes-acp
|
||||
python -m acp_adapter
|
||||
```
|
||||
|
||||
Install support first:
|
||||
|
||||
```bash
|
||||
pip install -e '.[acp]'
|
||||
```
|
||||
|
||||
See [ACP Editor Integration](../user-guide/features/acp.md) and [ACP Internals](../developer-guide/acp-internals.md).
|
||||
|
||||
## `hermes mcp`
|
||||
|
||||
```bash
|
||||
hermes mcp <subcommand>
|
||||
```
|
||||
|
||||
Manage MCP (Model Context Protocol) server configurations.
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `add <name> [--url URL] [--command CMD] [--args ...] [--auth oauth\|header]` | Add an MCP server with automatic tool discovery. |
|
||||
| `remove <name>` (alias: `rm`) | Remove an MCP server from config. |
|
||||
| `list` (alias: `ls`) | List configured MCP servers. |
|
||||
| `test <name>` | Test connection to an MCP server. |
|
||||
| `configure <name>` (alias: `config`) | Toggle tool selection for a server. |
|
||||
|
||||
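A typical flow is to add a server and immediately test the connection. In this sketch the server name and URL are placeholders, and `add_and_test_mcp` is a hypothetical helper that simply forwards its extra arguments to `hermes mcp add`.

```bash
# Hypothetical helper: add an MCP server, then test the connection to it.
# Usage: add_and_test_mcp <name> [options accepted by `hermes mcp add`]
add_and_test_mcp() {
  name=$1
  shift
  hermes mcp add "$name" "$@" || return 1
  hermes mcp test "$name"
}

# Only run the example when the hermes CLI is actually installed.
if command -v hermes >/dev/null 2>&1; then
  # Placeholder name and URL; replace with a real MCP endpoint.
  add_and_test_mcp docs-remote --url "https://mcp.example.com/mcp" --auth oauth
  hermes mcp list
fi
```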
See [MCP Config Reference](./mcp-config-reference.md) and [Use MCP with Hermes](../guides/use-mcp-with-hermes.md).
|
||||
|
||||
## `hermes plugins`
|
||||
|
||||
```bash
|
||||
hermes plugins <subcommand>
|
||||
```
|
||||
|
||||
Manage Hermes Agent plugins.
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `install <identifier> [--force]` | Install a plugin from a Git URL or `owner/repo`. |
|
||||
| `update <name>` | Pull latest changes for an installed plugin. |
|
||||
| `remove <name>` (aliases: `rm`, `uninstall`) | Remove an installed plugin. |
|
||||
| `list` (alias: `ls`) | List installed plugins. |
|
||||
|
||||
See [Plugins](../user-guide/features/plugins.md) and [Build a Hermes Plugin](../guides/build-a-hermes-plugin.md).
|
||||
|
||||
## `hermes tools`
|
||||
|
||||
```bash
|
||||
hermes tools [--summary]
|
||||
```
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--summary` | Print the current enabled-tools summary and exit. |
|
||||
|
||||
Without `--summary`, this launches the interactive per-platform tool configuration UI.
|
||||
|
||||
## `hermes sessions`
|
||||
|
||||
```bash
|
||||
hermes sessions <subcommand>
|
||||
```
|
||||
|
||||
Subcommands:
|
||||
|
||||
| Subcommand | Description |
|
||||
|------------|-------------|
|
||||
| `list` | List recent sessions. |
|
||||
| `browse` | Interactive session picker with search and resume. |
|
||||
| `export <output> [--session-id ID]` | Export sessions to JSONL. |
|
||||
| `delete <session-id>` | Delete one session. |
|
||||
| `prune` | Delete old sessions. |
|
||||
| `stats` | Show session-store statistics. |
|
||||
| `rename <session-id> <title>` | Set or change a session title. |
|
||||
|
||||
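The `export` subcommand is useful for backups and offline analysis. The sketch below assumes the output file holds one JSON object per line, which is what JSONL conventionally means; whether a line corresponds to a whole session or a single message is not stated here. `export_sessions` is a hypothetical helper.

```bash
# Hypothetical helper: export sessions to OUT and report how many records were written.
export_sessions() {
  out=$1
  hermes sessions export "$out" || return 1
  # wc -l counts lines, i.e. records under the one-object-per-line assumption.
  records=$(wc -l < "$out" | tr -d ' ')
  echo "exported $records record(s) to $out"
}

# Only run the example when the hermes CLI is actually installed.
if command -v hermes >/dev/null 2>&1; then
  export_sessions "hermes-sessions-$(date +%Y%m%d).jsonl"
fi
```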
## `hermes insights`
|
||||
|
||||
```bash
|
||||
hermes insights [--days N] [--source platform]
|
||||
```
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `--days <n>` | Analyze the last `n` days (default: 30). |
|
||||
| `--source <platform>` | Filter by source such as `cli`, `telegram`, or `discord`. |
|
||||
|
||||
## `hermes claw`
|
||||
|
||||
```bash
|
||||
hermes claw migrate
|
||||
```
|
||||
|
||||
Used to migrate settings, memories, skills, and keys from OpenClaw to Hermes.
|
||||
|
||||
## Maintenance commands
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `hermes version` | Print version information. |
|
||||
| `hermes update` | Pull latest changes and reinstall dependencies. |
|
||||
| `hermes uninstall [--full] [--yes]` | Remove Hermes, optionally deleting all config/data. |
|
||||
|
||||
## See also
|
||||
|
||||
- [Slash Commands Reference](./slash-commands.md)
|
||||
- [CLI Interface](../user-guide/cli.md)
|
||||
- [Sessions](../user-guide/sessions.md)
|
||||
- [Skills System](../user-guide/features/skills.md)
|
||||
- [Skins & Themes](../user-guide/features/skins.md)
|
||||
302
hermes_code/website/docs/reference/environment-variables.md
Normal file
|
|
@ -0,0 +1,302 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Environment Variables"
|
||||
description: "Complete reference of all environment variables used by Hermes Agent"
|
||||
---
|
||||
|
||||
# Environment Variables Reference
|
||||
|
||||
All variables go in `~/.hermes/.env`. You can also set them with `hermes config set VAR value`.
|
||||
|
||||
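A minimal `.env` might look like the fragment below. The values are placeholders, and the selection of variables is only an example; each one is described in the tables on this page.

```bash
# ~/.hermes/.env (placeholder values, adjust to your setup)
OPENROUTER_API_KEY=your-openrouter-key
HERMES_TIMEZONE=America/New_York
TERMINAL_ENV=docker
TERMINAL_DOCKER_IMAGE=python:3.11
```

Keep this file out of version control, since it holds API keys.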
## LLM Providers
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `OPENROUTER_API_KEY` | OpenRouter API key (recommended for flexibility) |
|
||||
| `OPENROUTER_BASE_URL` | Override the OpenRouter-compatible base URL |
|
||||
| `AI_GATEWAY_API_KEY` | Vercel AI Gateway API key ([ai-gateway.vercel.sh](https://ai-gateway.vercel.sh)) |
|
||||
| `AI_GATEWAY_BASE_URL` | Override AI Gateway base URL (default: `https://ai-gateway.vercel.sh/v1`) |
|
||||
| `OPENAI_API_KEY` | API key for custom OpenAI-compatible endpoints (used with `OPENAI_BASE_URL`) |
|
||||
| `OPENAI_BASE_URL` | Base URL for a custom OpenAI-compatible endpoint (vLLM, SGLang, etc.) |
|
||||
| `COPILOT_GITHUB_TOKEN` | GitHub token for Copilot API — first priority (OAuth `gho_*` or fine-grained PAT `github_pat_*`; classic PATs `ghp_*` are **not supported**) |
|
||||
| `GH_TOKEN` | GitHub token — second priority for Copilot (also used by `gh` CLI) |
|
||||
| `GITHUB_TOKEN` | GitHub token — third priority for Copilot |
|
||||
| `HERMES_COPILOT_ACP_COMMAND` | Override Copilot ACP CLI binary path (default: `copilot`) |
|
||||
| `COPILOT_CLI_PATH` | Alias for `HERMES_COPILOT_ACP_COMMAND` |
|
||||
| `HERMES_COPILOT_ACP_ARGS` | Override Copilot ACP arguments (default: `--acp --stdio`) |
|
||||
| `COPILOT_ACP_BASE_URL` | Override Copilot ACP base URL |
|
||||
| `GLM_API_KEY` | z.ai / ZhipuAI GLM API key ([z.ai](https://z.ai)) |
|
||||
| `ZAI_API_KEY` | Alias for `GLM_API_KEY` |
|
||||
| `Z_AI_API_KEY` | Alias for `GLM_API_KEY` |
|
||||
| `GLM_BASE_URL` | Override z.ai base URL (default: `https://api.z.ai/api/paas/v4`) |
|
||||
| `KIMI_API_KEY` | Kimi / Moonshot AI API key ([moonshot.ai](https://platform.moonshot.ai)) |
|
||||
| `KIMI_BASE_URL` | Override Kimi base URL (default: `https://api.moonshot.ai/v1`) |
|
||||
| `MINIMAX_API_KEY` | MiniMax API key — global endpoint ([minimax.io](https://www.minimax.io)) |
|
||||
| `MINIMAX_BASE_URL` | Override MiniMax base URL (default: `https://api.minimax.io/v1`) |
|
||||
| `MINIMAX_CN_API_KEY` | MiniMax API key — China endpoint ([minimaxi.com](https://www.minimaxi.com)) |
|
||||
| `MINIMAX_CN_BASE_URL` | Override MiniMax China base URL (default: `https://api.minimaxi.com/v1`) |
|
||||
| `KILOCODE_API_KEY` | Kilo Code API key ([kilo.ai](https://kilo.ai)) |
|
||||
| `KILOCODE_BASE_URL` | Override Kilo Code base URL (default: `https://api.kilo.ai/api/gateway`) |
|
||||
| `ANTHROPIC_API_KEY` | Anthropic Console API key ([console.anthropic.com](https://console.anthropic.com/)) |
|
||||
| `ANTHROPIC_TOKEN` | Manual or legacy Anthropic OAuth/setup-token override |
|
||||
| `DASHSCOPE_API_KEY` | Alibaba Cloud DashScope API key for Qwen models ([modelstudio.console.alibabacloud.com](https://modelstudio.console.alibabacloud.com/)) |
|
||||
| `DASHSCOPE_BASE_URL` | Custom DashScope base URL (default: international endpoint) |
|
||||
| `DEEPSEEK_API_KEY` | DeepSeek API key for direct DeepSeek access ([platform.deepseek.com](https://platform.deepseek.com/api_keys)) |
|
||||
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
|
||||
| `OPENCODE_ZEN_API_KEY` | OpenCode Zen API key — pay-as-you-go access to curated models ([opencode.ai](https://opencode.ai/auth)) |
|
||||
| `OPENCODE_ZEN_BASE_URL` | Override OpenCode Zen base URL |
|
||||
| `OPENCODE_GO_API_KEY` | OpenCode Go API key — $10/month subscription for open models ([opencode.ai](https://opencode.ai/auth)) |
|
||||
| `OPENCODE_GO_BASE_URL` | Override OpenCode Go base URL |
|
||||
| `CLAUDE_CODE_OAUTH_TOKEN` | Explicit Claude Code token override if you export one manually |
|
||||
| `HERMES_MODEL` | Preferred model name (checked before `LLM_MODEL`, used by gateway) |
|
||||
| `LLM_MODEL` | Default model name (fallback when not set in config.yaml) |
|
||||
| `VOICE_TOOLS_OPENAI_KEY` | Preferred OpenAI key for OpenAI speech-to-text and text-to-speech providers |
|
||||
| `HERMES_LOCAL_STT_COMMAND` | Optional local speech-to-text command template. Supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders |
|
||||
| `HERMES_LOCAL_STT_LANGUAGE` | Default language passed to `HERMES_LOCAL_STT_COMMAND` or to the auto-detected local `whisper` CLI fallback (default: `en`) |
|
||||
| `HERMES_HOME` | Override Hermes config directory (default: `~/.hermes`). Also scopes the gateway PID file and systemd service name, so multiple installations can run concurrently |
|
||||
|
||||
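The three Copilot token variables in the table are checked in a fixed priority order. The sketch below only illustrates that order and the documented token types; it is not Hermes' actual lookup code. Both function names are hypothetical, and the prefix check accepts only the two token types the table names as supported.

```bash
# Illustration of the documented priority:
# COPILOT_GITHUB_TOKEN first, then GH_TOKEN, then GITHUB_TOKEN.
resolve_copilot_token() {
  printf '%s' "${COPILOT_GITHUB_TOKEN:-${GH_TOKEN:-${GITHUB_TOKEN:-}}}"
}

# Accept OAuth (gho_*) and fine-grained PAT (github_pat_*) tokens.
# Everything else, including classic PATs (ghp_*), is rejected by this sketch.
is_supported_copilot_token() {
  case $1 in
    gho_*|github_pat_*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage: report whether a usable token is present without printing it.
token=$(resolve_copilot_token)
if is_supported_copilot_token "$token"; then
  echo "found a supported Copilot token"
else
  echo "no supported Copilot token found" >&2
fi
```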
## Provider Auth (OAuth)
|
||||
|
||||
For native Anthropic auth, Hermes prefers Claude Code's own credential files when they exist because those credentials can refresh automatically. Environment variables such as `ANTHROPIC_TOKEN` remain useful as manual overrides, but they are no longer the preferred path for Claude Pro/Max login.
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode` (default: `auto`) |
|
||||
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
|
||||
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
|
||||
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Minimum remaining agent key TTL, in seconds, before the key is re-minted (default: `1800`, i.e. 30 minutes) |
|
||||
| `HERMES_NOUS_TIMEOUT_SECONDS` | HTTP timeout for Nous credential / token flows |
|
||||
| `HERMES_DUMP_REQUESTS` | Dump API request payloads to log files (`true`/`false`) |
|
||||
| `HERMES_PREFILL_MESSAGES_FILE` | Path to a JSON file of ephemeral prefill messages injected at API-call time |
|
||||
| `HERMES_TIMEZONE` | IANA timezone override (for example `America/New_York`) |
|
||||
|
||||
## Tool APIs
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `PARALLEL_API_KEY` | AI-native web search ([parallel.ai](https://parallel.ai/)) |
|
||||
| `FIRECRAWL_API_KEY` | Web scraping ([firecrawl.dev](https://firecrawl.dev/)) |
|
||||
| `FIRECRAWL_API_URL` | Custom Firecrawl API endpoint for self-hosted instances (optional) |
|
||||
| `TAVILY_API_KEY` | Tavily API key for AI-native web search, extract, and crawl ([app.tavily.com](https://app.tavily.com/home)) |
|
||||
| `BROWSERBASE_API_KEY` | Browser automation ([browserbase.com](https://browserbase.com/)) |
|
||||
| `BROWSERBASE_PROJECT_ID` | Browserbase project ID |
|
||||
| `BROWSER_USE_API_KEY` | Browser Use cloud browser API key ([browser-use.com](https://browser-use.com/)) |
|
||||
| `BROWSER_CDP_URL` | Chrome DevTools Protocol URL for local browser (set via `/browser connect`, e.g. `ws://localhost:9222`) |
|
||||
| `BROWSER_INACTIVITY_TIMEOUT` | Browser session inactivity timeout in seconds |
|
||||
| `FAL_KEY` | Image generation ([fal.ai](https://fal.ai/)) |
|
||||
| `GROQ_API_KEY` | Groq Whisper STT API key ([groq.com](https://groq.com/)) |
|
||||
| `ELEVENLABS_API_KEY` | ElevenLabs premium TTS voices ([elevenlabs.io](https://elevenlabs.io/)) |
|
||||
| `STT_GROQ_MODEL` | Override the Groq STT model (default: `whisper-large-v3-turbo`) |
|
||||
| `GROQ_BASE_URL` | Override the Groq OpenAI-compatible STT endpoint |
|
||||
| `STT_OPENAI_MODEL` | Override the OpenAI STT model (default: `whisper-1`) |
|
||||
| `STT_OPENAI_BASE_URL` | Override the OpenAI-compatible STT endpoint |
|
||||
| `GITHUB_TOKEN` | GitHub token for Skills Hub (higher API rate limits, skill publish) |
|
||||
| `HONCHO_API_KEY` | Cross-session user modeling ([honcho.dev](https://honcho.dev/)) |
|
||||
| `HONCHO_BASE_URL` | Base URL for self-hosted Honcho instances (default: Honcho cloud). No API key required for local instances |
|
||||
| `TINKER_API_KEY` | RL training ([tinker-console.thinkingmachines.ai](https://tinker-console.thinkingmachines.ai/)) |
|
||||
| `WANDB_API_KEY` | RL training metrics ([wandb.ai](https://wandb.ai/)) |
|
||||
| `DAYTONA_API_KEY` | Daytona cloud sandboxes ([daytona.io](https://daytona.io/)) |
|
||||
|
||||
## Terminal Backend
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `TERMINAL_ENV` | Backend: `local`, `docker`, `ssh`, `singularity`, `modal`, `daytona` |
|
||||
| `TERMINAL_DOCKER_IMAGE` | Docker image (default: `python:3.11`) |
|
||||
| `TERMINAL_DOCKER_FORWARD_ENV` | JSON array of env var names to explicitly forward into Docker terminal sessions |
|
||||
| `TERMINAL_DOCKER_VOLUMES` | Additional Docker volume mounts (comma-separated `host:container` pairs) |
|
||||
| `TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE` | Advanced opt-in: mount the launch cwd into Docker `/workspace` (`true`/`false`, default: `false`) |
|
||||
| `TERMINAL_SINGULARITY_IMAGE` | Singularity image or `.sif` path |
|
||||
| `TERMINAL_MODAL_IMAGE` | Modal container image |
|
||||
| `TERMINAL_DAYTONA_IMAGE` | Daytona sandbox image |
|
||||
| `TERMINAL_TIMEOUT` | Command timeout in seconds |
|
||||
| `TERMINAL_LIFETIME_SECONDS` | Max lifetime for terminal sessions in seconds |
|
||||
| `TERMINAL_CWD` | Working directory for all terminal sessions |
|
||||
| `SUDO_PASSWORD` | Enable sudo without interactive prompt |
|
||||
|
||||
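The two list-style Docker variables above use different encodings: `TERMINAL_DOCKER_FORWARD_ENV` is a JSON array, while `TERMINAL_DOCKER_VOLUMES` is a comma-separated list of `host:container` pairs. The sketch below shows one way such values could be parsed. The helper names `parse_forward_env` and `parse_volumes` are hypothetical, and this is not Hermes's actual parsing code.

```python
import json


def parse_forward_env(raw: str) -> list[str]:
    """Parse a JSON array of env var names (hypothetical helper)."""
    if not raw.strip():
        return []
    names = json.loads(raw)
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        raise ValueError("expected a JSON array of strings")
    return names


def parse_volumes(raw: str) -> list[tuple[str, str]]:
    """Split comma-separated host:container pairs (hypothetical helper)."""
    pairs = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first colon only, so a suffix such as ":ro"
        # stays attached to the container side.
        host, sep, container = item.partition(":")
        if not sep or not host or not container:
            raise ValueError(f"expected host:container, got {item!r}")
        pairs.append((host, container))
    return pairs
```

For example, `TERMINAL_DOCKER_FORWARD_ENV='["GITHUB_TOKEN"]'` and `TERMINAL_DOCKER_VOLUMES='/data:/mnt/data'` would both pass through these helpers without error.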
## SSH Backend
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `TERMINAL_SSH_HOST` | Remote server hostname |
|
||||
| `TERMINAL_SSH_USER` | SSH username |
|
||||
| `TERMINAL_SSH_PORT` | SSH port (default: 22) |
|
||||
| `TERMINAL_SSH_KEY` | Path to private key |
|
||||
| `TERMINAL_SSH_PERSISTENT` | Override persistent shell for SSH (default: follows `TERMINAL_PERSISTENT_SHELL`) |
|
||||
|
||||
## Container Resources (Docker, Singularity, Modal, Daytona)
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `TERMINAL_CONTAINER_CPU` | CPU cores (default: 1) |
|
||||
| `TERMINAL_CONTAINER_MEMORY` | Memory in MB (default: 5120) |
|
||||
| `TERMINAL_CONTAINER_DISK` | Disk in MB (default: 51200) |
|
||||
| `TERMINAL_CONTAINER_PERSISTENT` | Persist container filesystem across sessions (default: `true`) |
|
||||
| `TERMINAL_SANDBOX_DIR` | Host directory for workspaces and overlays (default: `~/.hermes/sandboxes/`) |
|
||||
|
||||
## Persistent Shell
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `TERMINAL_PERSISTENT_SHELL` | Enable persistent shell for non-local backends (default: `true`). Also settable via `terminal.persistent_shell` in config.yaml |
|
||||
| `TERMINAL_LOCAL_PERSISTENT` | Enable persistent shell for local backend (default: `false`) |
|
||||
| `TERMINAL_SSH_PERSISTENT` | Override persistent shell for SSH backend (default: follows `TERMINAL_PERSISTENT_SHELL`) |
|
||||
|
||||
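The defaults in this table imply a precedence order: the local backend has its own switch, and the SSH override, when set, wins over the general setting. A minimal sketch of that resolution follows. The function name and the set of accepted truthy strings are assumptions, not taken from the Hermes source.

```python
import os


def _as_bool(value: str) -> bool:
    # Assumed truthy spellings; the table only documents true/false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def persistent_shell_enabled(backend: str, env=None) -> bool:
    """Resolve the persistent-shell setting for a backend (hypothetical helper)."""
    env = os.environ if env is None else env
    if backend == "local":
        # The local backend has its own variable, default false.
        return _as_bool(env.get("TERMINAL_LOCAL_PERSISTENT", "false"))
    general = _as_bool(env.get("TERMINAL_PERSISTENT_SHELL", "true"))
    if backend == "ssh" and "TERMINAL_SSH_PERSISTENT" in env:
        # An explicit SSH setting overrides the general one.
        return _as_bool(env["TERMINAL_SSH_PERSISTENT"])
    return general
```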
## Messaging
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `TELEGRAM_BOT_TOKEN` | Telegram bot token (from @BotFather) |
|
||||
| `TELEGRAM_ALLOWED_USERS` | Comma-separated user IDs allowed to use the bot |
|
||||
| `TELEGRAM_HOME_CHANNEL` | Default Telegram chat/channel for cron delivery |
|
||||
| `TELEGRAM_HOME_CHANNEL_NAME` | Display name for the Telegram home channel |
|
||||
| `DISCORD_BOT_TOKEN` | Discord bot token |
|
||||
| `DISCORD_ALLOWED_USERS` | Comma-separated Discord user IDs allowed to use the bot |
|
||||
| `DISCORD_HOME_CHANNEL` | Default Discord channel for cron delivery |
|
||||
| `DISCORD_HOME_CHANNEL_NAME` | Display name for the Discord home channel |
|
||||
| `DISCORD_REQUIRE_MENTION` | Require an @mention before responding in server channels |
|
||||
| `DISCORD_FREE_RESPONSE_CHANNELS` | Comma-separated channel IDs where mention is not required |
|
||||
| `DISCORD_AUTO_THREAD` | Auto-thread long replies when supported |
|
||||
| `SLACK_BOT_TOKEN` | Slack bot token (`xoxb-...`) |
|
||||
| `SLACK_APP_TOKEN` | Slack app-level token (`xapp-...`, required for Socket Mode) |
|
||||
| `SLACK_ALLOWED_USERS` | Comma-separated Slack user IDs |
|
||||
| `SLACK_HOME_CHANNEL` | Default Slack channel for cron delivery |
|
||||
| `SLACK_HOME_CHANNEL_NAME` | Display name for the Slack home channel |
|
||||
| `WHATSAPP_ENABLED` | Enable the WhatsApp bridge (`true`/`false`) |
|
||||
| `WHATSAPP_MODE` | `bot` (separate number) or `self-chat` (message yourself) |
|
||||
| `WHATSAPP_ALLOWED_USERS` | Comma-separated phone numbers (with country code, no `+`) |
|
||||
| `SIGNAL_HTTP_URL` | signal-cli daemon HTTP endpoint (for example `http://127.0.0.1:8080`) |
|
||||
| `SIGNAL_ACCOUNT` | Bot phone number in E.164 format |
|
||||
| `SIGNAL_ALLOWED_USERS` | Comma-separated E.164 phone numbers or UUIDs |
|
||||
| `SIGNAL_GROUP_ALLOWED_USERS` | Comma-separated group IDs, or `*` for all groups |
|
||||
| `SIGNAL_HOME_CHANNEL_NAME` | Display name for the Signal home channel |
|
||||
| `SIGNAL_IGNORE_STORIES` | Ignore Signal stories/status updates |
|
||||
| `SIGNAL_ALLOW_ALL_USERS` | Allow all Signal users without an allowlist |
|
||||
| `TWILIO_ACCOUNT_SID` | Twilio Account SID (shared with telephony skill) |
|
||||
| `TWILIO_AUTH_TOKEN` | Twilio Auth Token (shared with telephony skill) |
|
||||
| `TWILIO_PHONE_NUMBER` | Twilio phone number in E.164 format (shared with telephony skill) |
|
||||
| `SMS_WEBHOOK_PORT` | Webhook listener port for inbound SMS (default: `8080`) |
|
||||
| `SMS_ALLOWED_USERS` | Comma-separated E.164 phone numbers allowed to chat |
|
||||
| `SMS_ALLOW_ALL_USERS` | Allow all SMS senders without an allowlist |
|
||||
| `SMS_HOME_CHANNEL` | Phone number for cron job / notification delivery |
|
||||
| `SMS_HOME_CHANNEL_NAME` | Display name for the SMS home channel |
|
||||
| `EMAIL_ADDRESS` | Email address for the Email gateway adapter |
|
||||
| `EMAIL_PASSWORD` | Password or app password for the email account |
|
||||
| `EMAIL_IMAP_HOST` | IMAP hostname for the email adapter |
|
||||
| `EMAIL_IMAP_PORT` | IMAP port |
|
||||
| `EMAIL_SMTP_HOST` | SMTP hostname for the email adapter |
|
||||
| `EMAIL_SMTP_PORT` | SMTP port |
|
||||
| `EMAIL_ALLOWED_USERS` | Comma-separated email addresses allowed to message the bot |
|
||||
| `EMAIL_HOME_ADDRESS` | Default recipient for proactive email delivery |
|
||||
| `EMAIL_HOME_ADDRESS_NAME` | Display name for the email home target |
|
||||
| `EMAIL_POLL_INTERVAL` | Email polling interval in seconds |
|
||||
| `EMAIL_ALLOW_ALL_USERS` | Allow all inbound email senders |
|
||||
| `DINGTALK_CLIENT_ID` | DingTalk bot AppKey from developer portal ([open.dingtalk.com](https://open.dingtalk.com)) |
|
||||
| `DINGTALK_CLIENT_SECRET` | DingTalk bot AppSecret from developer portal |
|
||||
| `DINGTALK_ALLOWED_USERS` | Comma-separated DingTalk user IDs allowed to message the bot |
|
||||
| `MATTERMOST_URL` | Mattermost server URL (e.g. `https://mm.example.com`) |
|
||||
| `MATTERMOST_TOKEN` | Bot token or personal access token for Mattermost |
|
||||
| `MATTERMOST_ALLOWED_USERS` | Comma-separated Mattermost user IDs allowed to message the bot |
|
||||
| `MATTERMOST_HOME_CHANNEL` | Channel ID for proactive message delivery (cron, notifications) |
|
||||
| `MATTERMOST_REPLY_MODE` | Reply style: `thread` (threaded replies) or `off` (flat messages, default) |
|
||||
| `MATRIX_HOMESERVER` | Matrix homeserver URL (e.g. `https://matrix.org`) |
|
||||
| `MATRIX_ACCESS_TOKEN` | Matrix access token for bot authentication |
|
||||
| `MATRIX_USER_ID` | Matrix user ID (e.g. `@hermes:matrix.org`) — required for password login, optional with access token |
|
||||
| `MATRIX_PASSWORD` | Matrix password (alternative to access token) |
|
||||
| `MATRIX_ALLOWED_USERS` | Comma-separated Matrix user IDs allowed to message the bot (e.g. `@alice:matrix.org`) |
|
||||
| `MATRIX_HOME_ROOM` | Room ID for proactive message delivery (e.g. `!abc123:matrix.org`) |
|
||||
| `MATRIX_ENCRYPTION` | Enable end-to-end encryption (`true`/`false`, default: `false`) |
|
||||
| `HASS_TOKEN` | Home Assistant Long-Lived Access Token (enables HA platform + tools) |
|
||||
| `HASS_URL` | Home Assistant URL (default: `http://homeassistant.local:8123`) |
|
||||
| `WEBHOOK_ENABLED` | Enable the webhook platform adapter (`true`/`false`) |
|
||||
| `WEBHOOK_PORT` | HTTP server port for receiving webhooks (default: `8644`) |
|
||||
| `WEBHOOK_SECRET` | Global HMAC secret for webhook signature validation (used as fallback when routes don't specify their own) |
|
||||
| `API_SERVER_ENABLED` | Enable the OpenAI-compatible API server (`true`/`false`). Runs alongside other platforms. |
|
||||
| `API_SERVER_KEY` | Bearer token for API server authentication. Strongly recommended; required for any network-accessible deployment. |
|
||||
| `API_SERVER_CORS_ORIGINS` | Comma-separated browser origins allowed to call the API server directly (for example `http://localhost:3000,http://127.0.0.1:3000`). Default: disabled. |
|
||||
| `API_SERVER_PORT` | Port for the API server (default: `8642`) |
|
||||
| `API_SERVER_HOST` | Host/bind address for the API server (default: `127.0.0.1`). Use `0.0.0.0` for network access only with `API_SERVER_KEY` and a narrow `API_SERVER_CORS_ORIGINS` allowlist. |
|
||||
| `MESSAGING_CWD` | Working directory for terminal commands in messaging mode (default: `~`) |
|
||||
| `GATEWAY_ALLOWED_USERS` | Comma-separated user IDs allowed across all platforms |
|
||||
| `GATEWAY_ALLOW_ALL_USERS` | Allow all users without allowlists (`true`/`false`, default: `false`) |
|
||||
|
||||
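`WEBHOOK_SECRET` is described above as an HMAC secret for signature validation. The sketch below shows the general technique on the sender and receiver side. It assumes HMAC-SHA256 with a hex-encoded digest, which this page does not specify, so check the webhook documentation for the actual algorithm and header name before relying on it.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The comparison uses `hmac.compare_digest` rather than `==` so that timing differences do not leak how much of the signature matched.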
## Agent Behavior
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `HERMES_MAX_ITERATIONS` | Max tool-calling iterations per conversation (default: 90) |
|
||||
| `HERMES_TOOL_PROGRESS` | Deprecated compatibility variable for tool progress display. Prefer `display.tool_progress` in `config.yaml`. |
|
||||
| `HERMES_TOOL_PROGRESS_MODE` | Deprecated compatibility variable for tool progress mode. Prefer `display.tool_progress` in `config.yaml`. |
|
||||
| `HERMES_HUMAN_DELAY_MODE` | Response pacing: `off`/`natural`/`custom` |
|
||||
| `HERMES_HUMAN_DELAY_MIN_MS` | Custom delay range minimum (ms) |
|
||||
| `HERMES_HUMAN_DELAY_MAX_MS` | Custom delay range maximum (ms) |
|
||||
| `HERMES_QUIET` | Suppress non-essential output (`true`/`false`) |
|
||||
| `HERMES_API_TIMEOUT` | LLM API call timeout in seconds (default: `900`) |
|
||||
| `HERMES_EXEC_ASK` | Enable execution approval prompts in gateway mode (`true`/`false`) |
|
||||
| `HERMES_ENABLE_PROJECT_PLUGINS` | Enable auto-discovery of repo-local plugins from `./.hermes/plugins/` (`true`/`false`, default: `false`) |
|
||||
| `HERMES_BACKGROUND_NOTIFICATIONS` | Background process notification mode in gateway: `all` (default), `result`, `error`, `off` |
|
||||
| `HERMES_EPHEMERAL_SYSTEM_PROMPT` | Ephemeral system prompt injected at API-call time (never persisted to sessions) |
|
||||
|
||||
## Session Settings
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `SESSION_IDLE_MINUTES` | Reset sessions after N minutes of inactivity (default: 1440) |
|
||||
| `SESSION_RESET_HOUR` | Daily reset hour in 24h format (default: 4 = 4am) |
|
||||
|
||||
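The two session settings describe two independent triggers: an idle timeout and a daily reset hour. The sketch below shows one plausible way to combine them. How Hermes actually combines the two rules is an assumption here, and `should_reset` is a hypothetical name.

```python
from datetime import datetime, timedelta


def should_reset(last_activity: datetime, now: datetime,
                 idle_minutes: int = 1440, reset_hour: int = 4) -> bool:
    """Return True if either reset rule applies (hypothetical helper)."""
    # Idle rule: no activity for at least idle_minutes.
    if now - last_activity >= timedelta(minutes=idle_minutes):
        return True
    # Daily rule: find the most recent reset boundary at or before now.
    boundary = now.replace(hour=reset_hour, minute=0, second=0, microsecond=0)
    if boundary > now:
        boundary -= timedelta(days=1)
    # Reset if the last activity happened before that boundary.
    return last_activity < boundary
```

With the defaults, a session last used at 03:00 would reset when the user returns at 10:00 the same day, because the 04:00 boundary fell in between.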
## Context Compression (config.yaml only)
|
||||
|
||||
Context compression is configured exclusively through the `compression` section in `config.yaml` — there are no environment variables for it.
|
||||
|
||||
```yaml
compression:
  enabled: true
  threshold: 0.50
  summary_model: google/gemini-3-flash-preview
  summary_provider: auto
  summary_base_url: null  # Custom OpenAI-compatible endpoint for summaries
```
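
One reading of `threshold: 0.50` is that compression starts once the conversation fills half of the model's context window. The sketch below illustrates that reading only. The interpretation of the value as a fraction of the context window is an assumption, and `should_compress` is a hypothetical name.

```python
def should_compress(tokens_used: int, context_length: int,
                    threshold: float = 0.50) -> bool:
    """Return True once usage reaches the threshold fraction (assumed semantics)."""
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    return tokens_used / context_length >= threshold
```

Under this reading, a 128,000-token model would begin compressing at 64,000 tokens.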
|
||||
|
||||
## Auxiliary Task Overrides
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `AUXILIARY_VISION_PROVIDER` | Override provider for vision tasks |
|
||||
| `AUXILIARY_VISION_MODEL` | Override model for vision tasks |
|
||||
| `AUXILIARY_VISION_BASE_URL` | Direct OpenAI-compatible endpoint for vision tasks |
|
||||
| `AUXILIARY_VISION_API_KEY` | API key paired with `AUXILIARY_VISION_BASE_URL` |
|
||||
| `AUXILIARY_WEB_EXTRACT_PROVIDER` | Override provider for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_MODEL` | Override model for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_BASE_URL` | Direct OpenAI-compatible endpoint for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_API_KEY` | API key paired with `AUXILIARY_WEB_EXTRACT_BASE_URL` |
|
||||
|
||||
For task-specific direct endpoints, Hermes uses the task's configured API key or `OPENAI_API_KEY`. It does not reuse `OPENROUTER_API_KEY` for those custom endpoints.
|
||||
|
||||
## Fallback Model (config.yaml only)
|
||||
|
||||
The primary model fallback is configured exclusively through `config.yaml` — there are no environment variables for it. Add a `fallback_model` section with `provider` and `model` keys to enable automatic failover when your main model encounters errors.
|
||||
|
||||
```yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4
```
|
||||
|
||||
See [Fallback Providers](/docs/user-guide/features/fallback-providers) for full details.
|
||||
|
||||
## Provider Routing (config.yaml only)
|
||||
|
||||
These go in `~/.hermes/config.yaml` under the `provider_routing` section:
|
||||
|
||||
| Key | Description |
|
||||
|-----|-------------|
|
||||
| `sort` | Sort providers: `"price"` (default), `"throughput"`, or `"latency"` |
|
||||
| `only` | List of provider slugs to allow (e.g., `["anthropic", "google"]`) |
|
||||
| `ignore` | List of provider slugs to skip |
|
||||
| `order` | List of provider slugs to try in order |
|
||||
| `require_parameters` | Only use providers supporting all request params (`true`/`false`) |
|
||||
| `data_collection` | `"allow"` (default) or `"deny"` to exclude data-storing providers |
|
||||
|
||||
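The keys and allowed values in this table can be checked mechanically before Hermes starts. The following sketch validates a `provider_routing` section that has already been loaded into a dict. The validator itself is hypothetical, and treating an absent `require_parameters` as `False` is an assumption.

```python
KNOWN_KEYS = {"sort", "only", "ignore", "order", "require_parameters", "data_collection"}
SORT_VALUES = {"price", "throughput", "latency"}
DATA_COLLECTION_VALUES = {"allow", "deny"}


def validate_provider_routing(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the section looks valid."""
    problems = [f"unknown key: {key}" for key in cfg if key not in KNOWN_KEYS]
    sort = cfg.get("sort", "price")
    if not isinstance(sort, str) or sort not in SORT_VALUES:
        problems.append("sort must be price, throughput, or latency")
    collection = cfg.get("data_collection", "allow")
    if not isinstance(collection, str) or collection not in DATA_COLLECTION_VALUES:
        problems.append("data_collection must be allow or deny")
    for key in ("only", "ignore", "order"):
        value = cfg.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of provider slugs")
    # Assumption: an absent require_parameters behaves like False.
    if not isinstance(cfg.get("require_parameters", False), bool):
        problems.append("require_parameters must be true or false")
    return problems
```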
:::tip
|
||||
Use `hermes config set` to set environment variables — it automatically saves them to the right file (`.env` for secrets, `config.yaml` for everything else).
|
||||
:::
|
||||
481
hermes_code/website/docs/reference/faq.md
Normal file
|
|
@@ -0,0 +1,481 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "FAQ & Troubleshooting"
|
||||
description: "Frequently asked questions and solutions to common issues with Hermes Agent"
|
||||
---
|
||||
|
||||
# FAQ & Troubleshooting
|
||||
|
||||
Quick answers and fixes for the most common questions and issues.
|
||||
|
||||
---
|
||||
|
||||
## Frequently Asked Questions
|
||||
|
||||
### What LLM providers work with Hermes?
|
||||
|
||||
Hermes Agent works with any OpenAI-compatible API. Supported providers include:
|
||||
|
||||
- **[OpenRouter](https://openrouter.ai/)** — access hundreds of models through one API key (recommended for flexibility)
|
||||
- **Nous Portal** — Nous Research's own inference endpoint
|
||||
- **OpenAI** — GPT-4o, o1, o3, etc.
|
||||
- **Anthropic** — Claude models (via OpenRouter or compatible proxy)
|
||||
- **Google** — Gemini models (via OpenRouter or compatible proxy)
|
||||
- **z.ai / ZhipuAI** — GLM models
|
||||
- **Kimi / Moonshot AI** — Kimi models
|
||||
- **MiniMax** — global and China endpoints
|
||||
- **Local models** — via [Ollama](https://ollama.com/), [vLLM](https://docs.vllm.ai/), [llama.cpp](https://github.com/ggerganov/llama.cpp), [SGLang](https://github.com/sgl-project/sglang), or any OpenAI-compatible server
|
||||
|
||||
Set your provider with `hermes model` or by editing `~/.hermes/.env`. See the [Environment Variables](./environment-variables.md) reference for all provider keys.
|
||||
|
||||
### Does it work on Windows?
|
||||
|
||||
**Not natively.** Hermes Agent requires a Unix-like environment. On Windows, install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run Hermes from inside it. The standard install command works unchanged inside WSL2:
|
||||
|
||||
```bash
|
||||
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
```
|
||||
|
||||
### Is my data sent anywhere?
|
||||
|
||||
API calls go **only to the LLM provider you configure** (e.g., OpenRouter, your local Ollama instance). Hermes Agent does not collect telemetry, usage data, or analytics. Your conversations, memory, and skills are stored locally in `~/.hermes/`.
|
||||
|
||||
### Can I use it offline / with local models?
|
||||
|
||||
Yes. Run `hermes model`, select **Custom endpoint**, and enter your server's URL:
|
||||
|
||||
```bash
|
||||
hermes model
|
||||
# Select: Custom endpoint (enter URL manually)
|
||||
# API base URL: http://localhost:11434/v1
|
||||
# API key: ollama
|
||||
# Model name: qwen3.5:27b
|
||||
# Context length: 32768 ← set this to match your server's actual context window
|
||||
```
|
||||
|
||||
Or configure it directly in `config.yaml`:
|
||||
|
||||
```yaml
model:
  default: qwen3.5:27b
  provider: custom
  base_url: http://localhost:11434/v1
```
|
||||
|
||||
Hermes persists the endpoint, provider, and base URL in `config.yaml` so it survives restarts. If your local server has exactly one model loaded, `/model custom` auto-detects it. You can also set `provider: custom` in config.yaml — it's a first-class provider, not an alias for anything else.
|
||||
|
||||
This works with Ollama, vLLM, llama.cpp server, SGLang, LocalAI, and others. See the [Configuration guide](../user-guide/configuration.md) for details.
|
||||
|
||||
:::tip Ollama users
|
||||
If you set a custom `num_ctx` in Ollama (for example, `PARAMETER num_ctx 16384` in a Modelfile), set the matching context length in Hermes as well. Ollama's `/api/show` reports the model's *maximum* context, not the effective `num_ctx` you configured.
|
||||
:::
|
||||
|
||||
### How much does it cost?
|
||||
|
||||
Hermes Agent itself is **free and open-source** (MIT license). You pay only for the LLM API usage from your chosen provider. Local models carry no API charges; you supply the hardware.
|
||||
|
||||
### Can multiple people use one instance?
|
||||
|
||||
Yes. The [messaging gateway](../user-guide/messaging/index.md) lets multiple users interact with the same Hermes Agent instance via Telegram, Discord, Slack, WhatsApp, or Home Assistant. Access is controlled through allowlists (specific user IDs) and DM pairing (first user to message claims access).
|
||||
|
||||
### What's the difference between memory and skills?
|
||||
|
||||
- **Memory** stores **facts** — things the agent knows about you, your projects, and preferences. Memories are retrieved automatically based on relevance.
|
||||
- **Skills** store **procedures** — step-by-step instructions for how to do things. Skills are recalled when the agent encounters a similar task.
|
||||
|
||||
Both persist across sessions. See [Memory](../user-guide/features/memory.md) and [Skills](../user-guide/features/skills.md) for details.
|
||||
|
||||
### Can I use it in my own Python project?
|
||||
|
||||
Yes. Import the `AIAgent` class and use Hermes programmatically:
|
||||
|
||||
```python
|
||||
from hermes.agent import AIAgent
|
||||
|
||||
agent = AIAgent(model="openrouter/nous/hermes-3-llama-3.1-70b")
|
||||
response = agent.chat("Explain quantum computing briefly")
|
||||
```
|
||||
|
||||
See the [Python Library guide](../user-guide/features/code-execution.md) for full API usage.
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Installation Issues
|
||||
|
||||
#### `hermes: command not found` after installation
|
||||
|
||||
**Cause:** Your shell hasn't reloaded the updated PATH.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Reload your shell profile
|
||||
source ~/.bashrc # bash
|
||||
source ~/.zshrc # zsh
|
||||
|
||||
# Or start a new terminal session
|
||||
```
|
||||
|
||||
If it still doesn't work, verify the install location:
|
||||
```bash
|
||||
which hermes
|
||||
ls ~/.local/bin/hermes
|
||||
```
|
||||
|
||||
:::tip
|
||||
The installer adds `~/.local/bin` to your PATH. If you use a non-standard shell config, add `export PATH="$HOME/.local/bin:$PATH"` manually.
|
||||
:::
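
If the command is still not found, it can help to confirm whether the install directory is actually on `PATH`. The check below is a small sketch using only the standard library. The function name `dir_on_path` is hypothetical.

```python
import os
from pathlib import Path


def dir_on_path(directory: str, path_value: str) -> bool:
    """Return True if directory is one of the entries in a PATH-style string."""
    target = Path(directory).expanduser()
    return any(
        Path(entry).expanduser() == target
        for entry in path_value.split(os.pathsep)
        if entry
    )


if __name__ == "__main__":
    print(dir_on_path("~/.local/bin", os.environ.get("PATH", "")))
```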
|
||||
|
||||
#### Python version too old
|
||||
|
||||
**Cause:** Hermes requires Python 3.11 or newer.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
python3 --version # Check current version
|
||||
|
||||
# Install a newer Python
|
||||
sudo apt install python3.12 # Ubuntu/Debian
|
||||
brew install python@3.12 # macOS
|
||||
```
|
||||
|
||||
The installer handles this automatically — if you see this error during manual installation, upgrade Python first.
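
For scripted or manual installs, the version requirement can be checked up front. This is a minimal sketch of such a guard, not part of the Hermes installer; `python_ok` is a hypothetical name.

```python
import sys

MIN_PYTHON = (3, 11)  # minimum version stated above


def python_ok(version_info=None) -> bool:
    """Return True if the interpreter meets the minimum major.minor version."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_ok():
        sys.exit("Python 3.11 or newer is required")
```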
|
||||
|
||||
#### `uv: command not found`
|
||||
|
||||
**Cause:** The `uv` package manager isn't installed or not in PATH.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
curl -LsSf https://astral.sh/uv/install.sh | sh
|
||||
source ~/.bashrc
|
||||
```
|
||||
|
||||
#### Permission denied errors during install
|
||||
|
||||
**Cause:** Insufficient permissions to write to the install directory.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Don't use sudo with the installer — it installs to ~/.local/bin
|
||||
# If you previously installed with sudo, clean up:
|
||||
sudo rm /usr/local/bin/hermes
|
||||
# Then re-run the standard installer
|
||||
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Provider & Model Issues
|
||||
|
||||
#### API key not working
|
||||
|
||||
**Cause:** Key is missing, expired, incorrectly set, or for the wrong provider.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Check your configuration
|
||||
hermes config show
|
||||
|
||||
# Re-configure your provider
|
||||
hermes model
|
||||
|
||||
# Or set directly
|
||||
hermes config set OPENROUTER_API_KEY sk-or-v1-xxxxxxxxxxxx
|
||||
```
|
||||
|
||||
:::warning
|
||||
Make sure the key matches the provider. An OpenAI key won't work with OpenRouter and vice versa. Check `~/.hermes/.env` for conflicting entries.
|
||||
:::
|
||||
|
||||
#### Model not available / model not found
|
||||
|
||||
**Cause:** The model identifier is incorrect or not available on your provider.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# List available models for your provider
|
||||
hermes model
|
||||
|
||||
# Set a valid model
|
||||
hermes config set HERMES_MODEL openrouter/nous/hermes-3-llama-3.1-70b
|
||||
|
||||
# Or specify per-session
|
||||
hermes chat --model openrouter/meta-llama/llama-3.1-70b-instruct
|
||||
```
|
||||
|
||||
#### Rate limiting (429 errors)
|
||||
|
||||
**Cause:** You've exceeded your provider's rate limits.
|
||||
|
||||
**Solution:** Wait a moment and retry. For sustained usage, consider:
|
||||
- Upgrading your provider plan
|
||||
- Switching to a different model or provider
|
||||
- Using `hermes chat --provider <alternative>` to route to a different backend
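
If you call a provider from your own scripts, "wait a moment and retry" is usually implemented as exponential backoff with jitter. The sketch below shows the pattern in isolation. It is a client-side illustration, not a description of how Hermes retries internally, and the `call` callable returning `(status_code, payload)` is a hypothetical interface.

```python
import random
import time


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Retry call() while it reports HTTP 429, doubling the delay each time."""
    status, payload = None, None
    for attempt in range(max_attempts):
        status, payload = call()
        if status != 429:
            return status, payload
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Add up to 50% jitter so parallel clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
    return status, payload
```

The `sleep` parameter exists so the function can be exercised without real waiting.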
|
||||
|
||||
#### Context length exceeded
|
||||
|
||||
**Cause:** The conversation has grown too long for the model's context window, or Hermes detected the wrong context length for your model.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Compress the current session
|
||||
/compress
|
||||
|
||||
# Or start a fresh session
|
||||
hermes chat
|
||||
|
||||
# Use a model with a larger context window
|
||||
hermes chat --model openrouter/google/gemini-2.0-flash-001
|
||||
```
|
||||
|
||||
If this happens on the first long conversation, Hermes may have the wrong context length for your model. Check what it detected:
|
||||
|
||||
Look at the CLI startup line — it shows the detected context length (e.g., `📊 Context limit: 128000 tokens`). You can also check with `/usage` during a session.
|
||||
|
||||
To fix context detection, set it explicitly:
|
||||
|
||||
```yaml
# In ~/.hermes/config.yaml
model:
  default: your-model-name
  context_length: 131072  # your model's actual context window
```
|
||||
|
||||
Or for custom endpoints, add it per-model:
|
||||
|
||||
```yaml
custom_providers:
  - name: "My Server"
    base_url: "http://localhost:11434/v1"
    models:
      qwen3.5:27b:
        context_length: 32768
```
|
||||
|
||||
See [Context Length Detection](../user-guide/configuration.md#context-length-detection) for how auto-detection works and all override options.
|
||||
|
||||
---
|
||||
|
||||
### Terminal Issues
|
||||
|
||||
#### Command blocked as dangerous
|
||||
|
||||
**Cause:** Hermes detected a potentially destructive command (e.g., `rm -rf`, `DROP TABLE`). This is a safety feature.
|
||||
|
||||
**Solution:** When prompted, review the command and type `y` to approve it. You can also:
|
||||
- Ask the agent to use a safer alternative
|
||||
- See the full list of dangerous patterns in the [Security docs](../user-guide/security.md)
|
||||
|
||||
:::tip
|
||||
This is working as intended — Hermes never silently runs destructive commands. The approval prompt shows you exactly what will execute.
|
||||
:::
|
||||
|
||||
#### `sudo` not working via messaging gateway
|
||||
|
||||
**Cause:** The messaging gateway runs without an interactive terminal, so `sudo` cannot prompt for a password.
|
||||
|
||||
**Solution:**
|
||||
- Avoid `sudo` in messaging — ask the agent to find alternatives
|
||||
- If you must use `sudo`, configure passwordless sudo for specific commands in `/etc/sudoers`
|
||||
- Or switch to the terminal interface for administrative tasks: `hermes chat`
|
||||
|
||||
#### Docker backend not connecting
|
||||
|
||||
**Cause:** Docker daemon isn't running or the user lacks permissions.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Check Docker is running
|
||||
docker info
|
||||
|
||||
# Add your user to the docker group
|
||||
sudo usermod -aG docker $USER
|
||||
newgrp docker
|
||||
|
||||
# Verify
|
||||
docker run hello-world
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Messaging Issues
|
||||
|
||||
#### Bot not responding to messages
|
||||
|
||||
**Cause:** The bot isn't running, isn't authorized, or your user isn't in the allowlist.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Check if the gateway is running
|
||||
hermes gateway status
|
||||
|
||||
# Start the gateway
|
||||
hermes gateway start
|
||||
|
||||
# Check logs for errors
|
||||
tail -n 50 ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
#### Messages not delivering
|
||||
|
||||
**Cause:** Network issues, bot token expired, or platform webhook misconfiguration.
|
||||
|
||||
**Solution:**
|
||||
- Verify your bot token is valid with `hermes gateway setup`
|
||||
- Check gateway logs: `tail -n 50 ~/.hermes/logs/gateway.log`
|
||||
- For webhook-based platforms (Slack, WhatsApp), ensure your server is publicly accessible
|
||||
|
||||
#### Allowlist confusion — who can talk to the bot?
|
||||
|
||||
**Cause:** Authorization mode determines who gets access.
|
||||
|
||||
**Solution:**
|
||||
|
||||
| Mode | How it works |
|
||||
|------|-------------|
|
||||
| **Allowlist** | Only user IDs listed in config can interact |
|
||||
| **DM pairing** | First user to message in DM claims exclusive access |
|
||||
| **Open** | Anyone can interact (not recommended for production) |
|
||||
|
||||
Configure in `~/.hermes/config.yaml` under your gateway's settings. See the [Messaging docs](../user-guide/messaging/index.md).
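
The allowlist and open modes in the table reduce to a simple membership check. The sketch below illustrates that logic for a comma-separated allowlist such as the `*_ALLOWED_USERS` variables. The function names are hypothetical, and denying everyone when the allowlist is empty is an assumption rather than documented behavior. DM pairing is not modeled.

```python
def parse_allowlist(raw: str) -> set[str]:
    """Split a comma-separated list of user IDs, ignoring blanks."""
    return {item.strip() for item in raw.split(",") if item.strip()}


def is_authorized(user_id: str, allowed_users: str = "",
                  allow_all: bool = False) -> bool:
    """Return True if the user may interact with the bot (hypothetical helper)."""
    if allow_all:
        # Open mode: anyone can interact.
        return True
    # Allowlist mode: assumed default-deny when the list is empty.
    return user_id in parse_allowlist(allowed_users)
```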
|
||||
|
||||
#### Gateway won't start
|
||||
|
||||
**Cause:** Missing dependencies, port conflicts, or misconfigured tokens.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Install messaging dependencies
|
||||
pip install "hermes-agent[telegram]" # or [discord], [slack], [whatsapp]
|
||||
|
||||
# Check for port conflicts
|
||||
lsof -i :8080
|
||||
|
||||
# Verify configuration
|
||||
hermes config show
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Performance Issues
|
||||
|
||||
#### Slow responses
|
||||
|
||||
**Cause:** Large model, distant API server, or heavy system prompt with many tools.
|
||||
|
||||
**Solution:**
|
||||
- Try a faster/smaller model: `hermes chat --model openrouter/meta-llama/llama-3.1-8b-instruct`
|
||||
- Reduce active toolsets: `hermes chat -t "terminal"`
|
||||
- Check your network latency to the provider
|
||||
- For local models, ensure you have enough GPU VRAM
|
||||
|
||||
#### High token usage
|
||||
|
||||
**Cause:** Long conversations, verbose system prompts, or many tool calls accumulating context.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Compress the conversation to reduce tokens
|
||||
/compress
|
||||
|
||||
# Check session token usage
|
||||
/usage
|
||||
```
|
||||
|
||||
:::tip
|
||||
Use `/compress` regularly during long sessions. It summarizes the conversation history and reduces token usage significantly while preserving context.
|
||||
:::
|
||||
|
||||
#### Session getting too long
|
||||
|
||||
**Cause:** Extended conversations accumulate messages and tool outputs, approaching context limits.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Compress current session (preserves key context)
|
||||
/compress
|
||||
|
||||
# Or start a fresh session
|
||||
hermes chat
|
||||
|
||||
# Resume a specific session later if needed
|
||||
hermes chat --continue
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### MCP Issues
|
||||
|
||||
#### MCP server not connecting
|
||||
|
||||
**Cause:** Server binary not found, wrong command path, or missing runtime.
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Ensure MCP dependencies are installed (already included in standard install)
|
||||
cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"
|
||||
|
||||
# For npm-based servers, ensure Node.js is available
|
||||
node --version
|
||||
npx --version
|
||||
|
||||
# Test the server manually
|
||||
npx -y @modelcontextprotocol/server-filesystem /tmp
|
||||
```
|
||||
|
||||
Verify your `~/.hermes/config.yaml` MCP configuration:
|
||||
```yaml
mcp_servers:
  filesystem:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/docs"]
```
|
||||
|
||||
#### Tools not showing up from MCP server
|
||||
|
||||
**Cause:** Server started but tool discovery failed, tools were filtered out by config, or the server does not support the MCP capability you expected.
|
||||
|
||||
**Solution:**
|
||||
- Check gateway/agent logs for MCP connection errors
|
||||
- Ensure the server responds to the `tools/list` RPC method
|
||||
- Review any `tools.include`, `tools.exclude`, `tools.resources`, `tools.prompts`, or `enabled` settings under that server
|
||||
- Remember that resource/prompt utility tools are only registered when the session actually supports those capabilities
|
||||
- Use `/reload-mcp` after changing config
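
When checking whether a server answers `tools/list`, it helps to know what the exchange looks like. MCP messages are JSON-RPC 2.0, so the request and the tool names in the response can be sketched as below. The helper names are hypothetical, and a real session must complete the MCP `initialize` handshake before this request is accepted.

```python
import json


def tools_list_request(request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 request for the tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def tool_names(response_text: str) -> list[str]:
    """Extract tool names from a tools/list response, raising on a JSON-RPC error."""
    response = json.loads(response_text)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    return [tool["name"] for tool in response["result"]["tools"]]
```

An empty list from `tool_names` means the server connected but exposes no tools, which points at the server rather than at Hermes filtering.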
|
||||
|
||||
```bash
|
||||
# Verify MCP servers are configured
|
||||
hermes config show | grep -A 12 mcp_servers
|
||||
|
||||
# Restart Hermes or reload MCP after config changes
|
||||
hermes chat
|
||||
```
|
||||
|
||||
See also:
|
||||
- [MCP (Model Context Protocol)](/docs/user-guide/features/mcp)
|
||||
- [Use MCP with Hermes](/docs/guides/use-mcp-with-hermes)
|
||||
- [MCP Config Reference](/docs/reference/mcp-config-reference)
|
||||
|
||||
#### MCP timeout errors
|
||||
|
||||
**Cause:** The MCP server is taking too long to respond, or it crashed during execution.
|
||||
|
||||
**Solution:**
|
||||
- Increase the timeout in your MCP server config if supported
|
||||
- Check if the MCP server process is still running
|
||||
- For remote HTTP MCP servers, check network connectivity
|
||||
|
||||
:::warning
|
||||
If an MCP server crashes mid-request, Hermes will report a timeout. Check the server's own logs (not just Hermes logs) to diagnose the root cause.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Still Stuck?
|
||||
|
||||
If your issue isn't covered here:
|
||||
|
||||
1. **Search existing issues:** [GitHub Issues](https://github.com/NousResearch/hermes-agent/issues)
|
||||
2. **Ask the community:** [Nous Research Discord](https://discord.gg/nousresearch)
|
||||
3. **File a bug report:** Include your OS, Python version (`python3 --version`), Hermes version (`hermes --version`), and the full error message
|
||||
215
hermes_code/website/docs/reference/mcp-config-reference.md
Normal file
|
|
@ -0,0 +1,215 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "MCP Config Reference"
|
||||
description: "Reference for Hermes Agent MCP configuration keys, filtering semantics, and utility-tool policy"
|
||||
---
|
||||
|
||||
# MCP Config Reference
|
||||
|
||||
This page is the compact reference companion to the main MCP docs.
|
||||
|
||||
For conceptual guidance, see:
|
||||
- [MCP (Model Context Protocol)](/docs/user-guide/features/mcp)
|
||||
- [Use MCP with Hermes](/docs/guides/use-mcp-with-hermes)
|
||||
|
||||
## Root config shape
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
  <server_name>:
|
||||
    command: "..." # stdio servers
|
||||
    args: []
|
||||
    env: {}
|
||||
|
||||
    # OR
|
||||
    url: "..." # HTTP servers
|
||||
    headers: {}
|
||||
|
||||
    enabled: true
|
||||
    timeout: 120
|
||||
    connect_timeout: 60
|
||||
    tools:
|
||||
      include: []
|
||||
      exclude: []
|
||||
      resources: true
|
||||
      prompts: true
|
||||
```
|
||||
|
||||
## Server keys
|
||||
|
||||
| Key | Type | Applies to | Meaning |
|
||||
|---|---|---|---|
|
||||
| `command` | string | stdio | Executable to launch |
|
||||
| `args` | list | stdio | Arguments for the subprocess |
|
||||
| `env` | mapping | stdio | Environment passed to the subprocess |
|
||||
| `url` | string | HTTP | Remote MCP endpoint |
|
||||
| `headers` | mapping | HTTP | Headers for remote server requests |
|
||||
| `enabled` | bool | both | Skip the server entirely when false |
|
||||
| `timeout` | number | both | Tool call timeout |
|
||||
| `connect_timeout` | number | both | Initial connection timeout |
|
||||
| `tools` | mapping | both | Filtering and utility-tool policy |
|
||||
|
||||
## `tools` policy keys
|
||||
|
||||
| Key | Type | Meaning |
|
||||
|---|---|---|
|
||||
| `include` | string or list | Allowlist: register only the named server-native MCP tools |
|
||||
| `exclude` | string or list | Denylist: register every server-native MCP tool except the named ones |
|
||||
| `resources` | bool-like | Enable/disable `list_resources` + `read_resource` |
|
||||
| `prompts` | bool-like | Enable/disable `list_prompts` + `get_prompt` |
|
||||
|
||||
## Filtering semantics
|
||||
|
||||
### `include`
|
||||
|
||||
If `include` is set, only those server-native MCP tools are registered.
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
  include: [create_issue, list_issues]
|
||||
```
|
||||
|
||||
### `exclude`
|
||||
|
||||
If `exclude` is set and `include` is not, every server-native MCP tool is registered except the ones named.
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
  exclude: [delete_customer]
|
||||
```
|
||||
|
||||
### Precedence
|
||||
|
||||
If both are set, `include` wins.
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
  include: [create_issue]
|
||||
  exclude: [create_issue, delete_issue]
|
||||
```
|
||||
|
||||
Result:
|
||||
- `create_issue` is still allowed
|
||||
- `delete_issue` is not registered: it is not in `include`, and the `exclude` list is ignored because `include` takes precedence
|
||||
|
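The filtering rules above can be sketched in a few lines of Python. This is an illustration rather than Hermes's actual implementation: `filter_tools` and `_as_names` are hypothetical names, and the sketch assumes that "set" means "present in the config" (that is, not `None`). How Hermes treats an empty list is not specified on this page.

```python
from typing import Iterable, List, Optional, Set, Union

NameSpec = Optional[Union[str, Iterable[str]]]


def _as_names(spec: NameSpec) -> Optional[Set[str]]:
    """Normalize a 'string or list' config value; None means the key is not set."""
    if spec is None:
        return None
    if isinstance(spec, str):
        return {spec}
    return set(spec)


def filter_tools(available: List[str],
                 include: NameSpec = None,
                 exclude: NameSpec = None) -> List[str]:
    """Apply include/exclude semantics to server-native tool names."""
    included = _as_names(include)
    excluded = _as_names(exclude)
    if included is not None:
        # `include` wins: `exclude` is ignored entirely when both are set.
        return [name for name in available if name in included]
    if excluded is not None:
        return [name for name in available if name not in excluded]
    return list(available)


if __name__ == "__main__":
    tools = ["create_issue", "delete_issue", "list_issues"]
    print(filter_tools(tools, include=["create_issue"],
                       exclude=["create_issue", "delete_issue"]))
```

With the precedence example from this section, the sketch keeps only `create_issue`, matching the result described above.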
||||
## Utility-tool policy
|
||||
|
||||
Hermes may register these utility wrappers per MCP server:
|
||||
|
||||
Resources:
|
||||
- `list_resources`
|
||||
- `read_resource`
|
||||
|
||||
Prompts:
|
||||
- `list_prompts`
|
||||
- `get_prompt`
|
||||
|
||||
### Disable resources
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
  resources: false
|
||||
```
|
||||
|
||||
### Disable prompts
|
||||
|
||||
```yaml
|
||||
tools:
|
||||
  prompts: false
|
||||
```
|
||||
|
||||
### Capability-aware registration
|
||||
|
||||
Even with `resources: true` or `prompts: true`, Hermes registers those utility tools only if the MCP session actually exposes the corresponding capability.
|
||||
|
||||
So it is normal to enable prompts and still see no prompt utilities when the server does not support prompts.
|
||||
|
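The interaction between the `resources`/`prompts` flags and the server's capabilities can be sketched as follows. This is a hedged illustration, not Hermes code: `utility_tools` is a hypothetical function, the capability names `"resources"` and `"prompts"` are assumed labels for what the session exposes, and the flags are taken as plain booleans (the "bool-like" parsing is not modeled).

```python
from typing import List, Set


def utility_tools(resources_enabled: bool,
                  prompts_enabled: bool,
                  capabilities: Set[str]) -> List[str]:
    """Return the utility wrappers that would be registered for one server.

    A wrapper pair is registered only when its config flag is on AND the
    session exposes the matching capability.
    """
    names: List[str] = []
    if resources_enabled and "resources" in capabilities:
        names += ["list_resources", "read_resource"]
    if prompts_enabled and "prompts" in capabilities:
        names += ["list_prompts", "get_prompt"]
    return names


if __name__ == "__main__":
    # Prompts are enabled in config, but the server only exposes resources.
    print(utility_tools(True, True, {"resources"}))
```

In the usage example, no prompt utilities appear even though prompts are enabled, which is the expected behavior described in this section.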
||||
## `enabled: false`
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
  legacy:
|
||||
    url: "https://mcp.legacy.internal"
|
||||
    enabled: false
|
||||
```
|
||||
|
||||
Behavior:
|
||||
- no connection attempt
|
||||
- no discovery
|
||||
- no tool registration
|
||||
- config remains in place for later reuse
|
||||
|
||||
## Empty result behavior
|
||||
|
||||
If filtering removes all server-native tools and no utility tools are registered, Hermes does not create an empty MCP runtime toolset for that server.
|
||||
|
||||
## Example configs
|
||||
|
||||
### Safe GitHub allowlist
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
  github:
|
||||
    command: "npx"
|
||||
    args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
    env:
|
||||
      GITHUB_PERSONAL_ACCESS_TOKEN: "***"
|
||||
    tools:
|
||||
      include: [list_issues, create_issue, update_issue, search_code]
|
||||
      resources: false
|
||||
      prompts: false
|
||||
```
|
||||
|
||||
### Stripe blacklist
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
  stripe:
|
||||
    url: "https://mcp.stripe.com"
|
||||
    headers:
|
||||
      Authorization: "Bearer ***"
|
||||
    tools:
|
||||
      exclude: [delete_customer, refund_payment]
|
||||
```
|
||||
|
||||
### Resource-only docs server
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
  docs:
|
||||
    url: "https://mcp.docs.example.com"
|
||||
    tools:
|
||||
      include: []
|
||||
      resources: true
|
||||
      prompts: false
|
||||
```
|
||||
|
||||
## Reloading config
|
||||
|
||||
After changing MCP config, reload servers with:
|
||||
|
||||
```text
|
||||
/reload-mcp
|
||||
```
|
||||
|
||||
## Tool naming
|
||||
|
||||
Server-native MCP tools become:
|
||||
|
||||
```text
|
||||
mcp_<server>_<tool>
|
||||
```
|
||||
|
||||
Examples:
|
||||
- `mcp_github_create_issue`
|
||||
- `mcp_filesystem_read_file`
|
||||
- `mcp_my_api_query_data`
|
||||
|
||||
Utility tools follow the same prefixing pattern:
|
||||
- `mcp_<server>_list_resources`
|
||||
- `mcp_<server>_read_resource`
|
||||
- `mcp_<server>_list_prompts`
|
||||
- `mcp_<server>_get_prompt`
|
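The naming pattern above can be expressed as a small helper. This is a sketch for illustration only: `registered_name` and `utility_names` are hypothetical, and the sketch does not model any normalization of server or tool names that Hermes might apply.

```python
from typing import List

UTILITY_TOOLS = ("list_resources", "read_resource", "list_prompts", "get_prompt")


def registered_name(server: str, tool: str) -> str:
    """Build the prefixed name following the mcp_<server>_<tool> pattern."""
    return f"mcp_{server}_{tool}"


def utility_names(server: str) -> List[str]:
    """Prefixed names of the utility wrappers for one server."""
    return [registered_name(server, tool) for tool in UTILITY_TOOLS]


if __name__ == "__main__":
    print(registered_name("github", "create_issue"))
    print(utility_names("docs"))
```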
||||
|
|
@ -0,0 +1,74 @@
|
|||
---
|
||||
sidebar_position: 6
|
||||
title: "Official Optional Skills Catalog"
|
||||
description: "Catalog of official optional skills available from the repository"
|
||||
---
|
||||
|
||||
# Official Optional Skills Catalog
|
||||
|
||||
Official optional skills live in the repository under `optional-skills/`. Install them with `hermes skills install official/<category>/<skill>` or browse them with `hermes skills browse --source official`.
|
||||
|
||||
## autonomous-ai-agents
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `blackbox` | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key. | `autonomous-ai-agents/blackbox` |
|
||||
|
||||
## blockchain
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `base` | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection. | `blockchain/base` |
|
||||
| `solana` | Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required. | `blockchain/solana` |
|
||||
|
||||
## creative
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `blender-mcp` | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python. | `creative/blender-mcp` |
|
||||
| `meme-generation` | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual .png meme files. | `creative/meme-generation` |
|
||||
|
||||
## email
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `agentmail` | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to). | `email/agentmail` |
|
||||
|
||||
## health
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `neuroskill-bci` | Connect to a running NeuroSkill instance and incorporate the user's real-time cognitive and emotional state (focus, relaxation, mood, cognitive load, drowsiness, heart rate, HRV, sleep staging, and 40+ derived EXG scores) into responses. Requires a BCI wearable (Muse 2/S or Open… | `health/neuroskill-bci` |
|
||||
|
||||
## mcp
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `fastmcp` | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. | `mcp/fastmcp` |
|
||||
|
||||
## migration
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `openclaw-migration` | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be migrated and why. | `migration/openclaw-migration` |
|
||||
|
||||
## productivity
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `telephony` | Give Hermes phone capabilities — provision a Twilio number, send/receive SMS/MMS, make direct calls, and place AI-driven outbound calls through Bland.ai or Vapi. | `productivity/telephony` |
|
||||
|
||||
## research
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `bioinformatics` | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, structural biology. | `research/bioinformatics` |
|
||||
| `qmd` | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration. | `research/qmd` |
|
||||
|
||||
## security
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `1password` | Set up and use 1Password CLI (op). Use when installing the CLI, enabling desktop app integration, signing in, and reading/injecting secrets for commands. | `security/1password` |
|
||||
| `oss-forensics` | Supply chain investigation, evidence recovery, and forensic analysis for GitHub repositories. Covers deleted commit recovery, force-push detection, IOC extraction. | `security/oss-forensics` |
|
||||
| `sherlock` | OSINT username search across 400+ social networks. Hunt down social media accounts by username. | `security/sherlock` |
|
||||
279
hermes_code/website/docs/reference/skills-catalog.md
Normal file
|
|
@ -0,0 +1,279 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Bundled Skills Catalog"
|
||||
description: "Catalog of bundled skills that ship with Hermes Agent"
|
||||
---
|
||||
|
||||
# Bundled Skills Catalog
|
||||
|
||||
Hermes ships with a large built-in skill library copied into `~/.hermes/skills/` on install. This page catalogs the bundled skills that live in the repository under `skills/`.
|
||||
|
||||
## apple
|
||||
|
||||
Apple/macOS-specific skills — iMessage, Reminders, Notes, FindMy, and macOS automation. These skills only load on macOS systems.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `apple-notes` | Manage Apple Notes via the memo CLI on macOS (create, view, search, edit). | `apple/apple-notes` |
|
||||
| `apple-reminders` | Manage Apple Reminders via remindctl CLI (list, add, complete, delete). | `apple/apple-reminders` |
|
||||
| `findmy` | Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture. | `apple/findmy` |
|
||||
| `imessage` | Send and receive iMessages/SMS via the imsg CLI on macOS. | `apple/imessage` |
|
||||
|
||||
## autonomous-ai-agents
|
||||
|
||||
Skills for spawning and orchestrating autonomous AI coding agents and multi-agent workflows — running independent agent processes, delegating tasks, and coordinating parallel workstreams.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `claude-code` | Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed. | `autonomous-ai-agents/claude-code` |
|
||||
| `codex` | Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository. | `autonomous-ai-agents/codex` |
|
||||
| `hermes-agent-spawning` | Spawn additional Hermes Agent instances as autonomous subprocesses for independent long-running tasks. Supports non-interactive one-shot mode (-q) and interactive PTY mode for multi-turn collaboration. Different from delegate_task — this runs a full separate hermes process. | `autonomous-ai-agents/hermes-agent` |
|
||||
| `opencode` | Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated. | `autonomous-ai-agents/opencode` |
|
||||
|
||||
## data-science
|
||||
|
||||
Skills for data science workflows — interactive exploration, Jupyter notebooks, data analysis, and visualization.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `jupyter-live-kernel` | Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results. | `data-science/jupyter-live-kernel` |
|
||||
|
||||
## creative
|
||||
|
||||
Creative content generation — ASCII art, hand-drawn style diagrams, and visual design tools.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `ascii-art` | Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required. | `creative/ascii-art` |
|
||||
| `ascii-video` | Production pipeline for ASCII art video in any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid… | `creative/ascii-video` |
|
||||
| `excalidraw` | Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links. | `creative/excalidraw` |
|
||||
|
||||
## dogfood
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `dogfood` | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports. | `dogfood/dogfood` |
|
||||
| `hermes-agent-setup` | Help users configure Hermes Agent — CLI usage, setup wizard, model/provider selection, tools, skills, voice/STT/TTS, gateway, and troubleshooting. | `dogfood/hermes-agent-setup` |
|
||||
|
||||
## email
|
||||
|
||||
Skills for sending, receiving, searching, and managing email from the terminal.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `himalaya` | CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language). | `email/himalaya` |
|
||||
|
||||
## gaming
|
||||
|
||||
Skills for setting up, configuring, and managing game servers, modpacks, and gaming-related infrastructure.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `minecraft-modpack-server` | Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts. | `gaming/minecraft-modpack-server` |
|
||||
| `pokemon-player` | Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal. | `gaming/pokemon-player` |
|
||||
|
||||
## github
|
||||
|
||||
GitHub workflow skills for managing repositories, pull requests, code reviews, issues, and CI/CD pipelines using the gh CLI and git via terminal.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `codebase-inspection` | Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats. | `github/codebase-inspection` |
|
||||
| `github-auth` | Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically. | `github/github-auth` |
|
||||
| `github-code-review` | Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-code-review` |
|
||||
| `github-issues` | Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-issues` |
|
||||
| `github-pr-workflow` | Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-pr-workflow` |
|
||||
| `github-repo-management` | Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-repo-management` |
|
||||
|
||||
## inference-sh
|
||||
|
||||
Skills for AI app execution via inference.sh cloud platform.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `inference-sh-cli` | Run 150+ AI apps via inference.sh CLI (infsh) — image generation, video creation, LLMs, search, 3D, social automation. | `inference-sh/cli` |
|
||||
|
||||
## leisure
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `find-nearby` | Find nearby places (restaurants, cafes, bars, pharmacies, etc.) using OpenStreetMap. Works with coordinates, addresses, cities, zip codes, or Telegram location pins. No API keys needed. | `leisure/find-nearby` |
|
||||
|
||||
## mcp
|
||||
|
||||
Skills for working with MCP (Model Context Protocol) servers, tools, and integrations. Includes the built-in native MCP client (configure servers in config.yaml for automatic tool discovery) and the mcporter CLI bridge for ad-hoc server interaction.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `mcporter` | Use the mcporter CLI to list, configure, auth, and call MCP servers/tools directly (HTTP or stdio), including ad-hoc servers, config edits, and CLI/type generation. | `mcp/mcporter` |
|
||||
| `native-mcp` | Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection. | `mcp/native-mcp` |
|
||||
|
||||
## media
|
||||
|
||||
Skills for working with media content — YouTube transcripts, GIF search, music generation, and audio visualization.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `gif-search` | Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat. | `media/gif-search` |
|
||||
| `heartmula` | Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. | `media/heartmula` |
|
||||
| `songsee` | Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation. | `media/songsee` |
|
||||
| `youtube-content` | Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). | `media/youtube-content` |
|
||||
|
||||
## mlops
|
||||
|
||||
General-purpose ML operations tools — model hub management, dataset operations, and workflow orchestration.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `huggingface-hub` | Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, deploy inference endpoints. | `mlops/huggingface-hub` |
|
||||
|
||||
## mlops/cloud
|
||||
|
||||
GPU cloud providers and serverless compute platforms for ML workloads.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `lambda-labs-gpu-cloud` | Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances with simple SSH access, persistent filesystems, or high-performance multi-node clusters for large-scale training. | `mlops/cloud/lambda-labs` |
|
||||
| `modal-serverless-gpu` | Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling. | `mlops/cloud/modal` |
|
||||
|
||||
## mlops/evaluation
|
||||
|
||||
Model evaluation benchmarks, experiment tracking, data curation, tokenizers, and interpretability tools.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `evaluating-llms-harness` | Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Sup… | `mlops/evaluation/lm-evaluation-harness` |
|
||||
| `huggingface-tokenizers` | Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use… | `mlops/evaluation/huggingface-tokenizers` |
|
||||
| `nemo-curator` | GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality t… | `mlops/evaluation/nemo-curator` |
|
||||
| `sparse-autoencoder-training` | Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language m… | `mlops/evaluation/saelens` |
|
||||
| `weights-and-biases` | Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform | `mlops/evaluation/weights-and-biases` |
|
||||
|
||||
## mlops/inference
|
||||
|
||||
Model serving, quantization (GGUF/GPTQ), structured output, inference optimization, and model surgery tools for deploying and running LLMs.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `gguf-quantization` | GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible quantization from 2-8 bit without GPU requirements. | `mlops/inference/gguf` |
|
||||
| `guidance` | Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance - Microsoft Research's constrained generation framework | `mlops/inference/guidance` |
|
||||
| `instructor` | Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor - battle-tested structured output library | `mlops/inference/instructor` |
|
||||
| `llama-cpp` | Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU. | `mlops/inference/llama-cpp` |
|
||||
| `obliteratus` | Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets ac… | `mlops/inference/obliteratus` |
|
||||
| `outlines` | Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library | `mlops/inference/outlines` |
|
||||
| `serving-llms-vllm` | Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), an… | `mlops/inference/vllm` |
|
||||
| `tensorrt-llm` | Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and mult… | `mlops/inference/tensorrt-llm` |
|
||||
|
||||
## mlops/models
|
||||
|
||||
Specific model architectures and tools — computer vision (CLIP, SAM, Stable Diffusion), speech (Whisper), audio generation (AudioCraft), and multimodal models (LLaVA).
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `audiocraft-audio-generation` | PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation. | `mlops/models/audiocraft` |
|
||||
| `clip` | OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpo… | `mlops/models/clip` |
|
||||
| `llava` | Large Language and Vision Assistant. Enables visual instruction tuning and image-based conversations. Combines CLIP vision encoder with Vicuna/LLaMA language models. Supports multi-turn image chat, visual question answering, and instruction following. Use for vision-language cha… | `mlops/models/llava` |
|
||||
| `segment-anything-model` | Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image. | `mlops/models/segment-anything` |
|
||||
| `stable-diffusion-image-generation` | State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines. | `mlops/models/stable-diffusion` |
|
||||
| `whisper` | OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio proc… | `mlops/models/whisper` |
|
||||
|
||||
## mlops/research
|
||||
|
||||
ML research frameworks for building and optimizing AI systems with declarative programming.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `dspy` | Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming | `mlops/research/dspy` |
|
||||
|
||||
## mlops/training
|
||||
|
||||
Fine-tuning, RLHF/DPO/GRPO training, distributed training frameworks, and optimization tools for training LLMs and other models.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `axolotl` | Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support | `mlops/training/axolotl` |
|
||||
| `distributed-llm-pretraining-torchtitan` | Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing. | `mlops/training/torchtitan` |
|
||||
| `fine-tuning-with-trl` | Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or train from human feedback. Works with HuggingFace Tr… | `mlops/training/trl-fine-tuning` |
|
||||
| `grpo-rl-training` | Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training | `mlops/training/grpo-rl-training` |
|
||||
| `hermes-atropos-environments` | Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, evaluation with tools, wandb logging, and the three CLI modes (serve/process/evaluate). Use when creating, reviewing, or f… | `mlops/training/hermes-atropos-environments` |
|
||||
| `huggingface-accelerate` | Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard. | `mlops/training/accelerate` |
|
||||
| `optimizing-attention-flash` | Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA,… | `mlops/training/flash-attention` |
|
||||
| `peft-fine-tuning` | Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library i… | `mlops/training/peft` |
|
||||
| `pytorch-fsdp` | Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision, CPU offloading, FSDP2 | `mlops/training/pytorch-fsdp` |
|
||||
| `pytorch-lightning` | High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices. | `mlops/training/pytorch-lightning` |
|
||||
| `simpo-training` | Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use for preference alignment when you want simpler, faster training than DPO/PPO. | `mlops/training/simpo` |
|
||||
| `slime-rl-training` | Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling. | `mlops/training/slime` |
|
||||
| `unsloth` | Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization | `mlops/training/unsloth` |
|
||||
|
||||
## mlops/vector-databases
|
||||
|
||||
Vector similarity search and embedding databases for RAG, semantic search, and AI application backends.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `chroma` | Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best… | `mlops/vector-databases/chroma` |
|
||||
| `faiss` | Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without… | `mlops/vector-databases/faiss` |
|
||||
| `pinecone` | Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for server… | `mlops/vector-databases/pinecone` |
|
||||
| `qdrant-vector-search` | High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance. | `mlops/vector-databases/qdrant` |
|
||||
|
||||
## note-taking
|
||||
|
||||
Note-taking skills for saving information, assisting with research, and collaborating on multi-session planning and information sharing.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `obsidian` | Read, search, and create notes in the Obsidian vault. | `note-taking/obsidian` |
|
||||
|
||||
## productivity
|
||||
|
||||
Skills for document creation, presentations, spreadsheets, and other productivity workflows.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `google-workspace` | Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration via Python. Uses OAuth2 with automatic token refresh. No external binaries needed — runs entirely with Google's Python client libraries in the Hermes venv. | `productivity/google-workspace` |
|
||||
| `linear` | Manage Linear issues, projects, and teams via the GraphQL API. Create, update, search, and organize issues. | `productivity/linear` |
|
||||
| `nano-pdf` | Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing. | `productivity/nano-pdf` |
|
||||
| `notion` | Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal. | `productivity/notion` |
|
||||
| `ocr-and-documents` | Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill. | `productivity/ocr-and-documents` |
|
||||
| `powerpoint` | "Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in a… | `productivity/powerpoint` |
|
||||
|
||||
## research
|
||||
|
||||
Skills for academic research, paper discovery, literature review, domain reconnaissance, market data, content monitoring, and scientific knowledge retrieval.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `arxiv` | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
|
||||
| `blogwatcher` | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI. Add blogs, scan for new articles, and track what you've read. | `research/blogwatcher` |
|
||||
| `domain-intel` | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required. | `research/domain-intel` |
|
||||
| `duckduckgo-search` | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content. | `research/duckduckgo-search` |
|
||||
| `parallel-cli` | Optional vendor skill for Parallel CLI — agent-native web search, extraction, deep research, enrichment, FindAll, and monitoring. | `research/parallel-cli` |
|
||||
| `ml-paper-writing` | Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verificatio… | `research/ml-paper-writing` |
|
||||
| `polymarket` | Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed. | `research/polymarket` |
|
||||
|
||||
## smart-home
|
||||
|
||||
Skills for controlling smart home devices — lights, switches, sensors, and home automation systems.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `openhue` | Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes. | `smart-home/openhue` |
|
||||
|
||||
## social-media
|
||||
|
||||
Skills for interacting with social platforms — posting, reading, monitoring, and account operations.
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `xitter` | Interact with X/Twitter via the x-cli terminal client using official X API credentials. | `social-media/xitter` |
|
||||
|
||||
## software-development
|
||||
|
||||
| Skill | Description | Path |
|
||||
|-------|-------------|------|
|
||||
| `code-review` | Guidelines for performing thorough code reviews with security and quality focus | `software-development/code-review` |
|
||||
| `plan` | Plan mode for Hermes — inspect context, write a markdown plan into `.hermes/plans/` in the active workspace/backend working directory, and do not execute the work. | `software-development/plan` |
|
||||
| `requesting-code-review` | Use when completing tasks, implementing major features, or before merging. Validates work meets requirements through systematic review process. | `software-development/requesting-code-review` |
|
||||
| `subagent-driven-development` | Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality). | `software-development/subagent-driven-development` |
|
||||
| `systematic-debugging` | Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation — NO fixes without understanding the problem first. | `software-development/systematic-debugging` |
|
||||
| `test-driven-development` | Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach. | `software-development/test-driven-development` |
|
||||
| `writing-plans` | Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples. | `software-development/writing-plans` |
|
||||
131
hermes_code/website/docs/reference/slash-commands.md
Normal file
|
|
@ -0,0 +1,131 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Slash Commands Reference"
|
||||
description: "Complete reference for interactive CLI and messaging slash commands"
|
||||
---
|
||||
|
||||
# Slash Commands Reference
|
||||
|
||||
Hermes has two slash-command surfaces, both driven by a central `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
|
||||
|
||||
- **Interactive CLI slash commands** — dispatched by `cli.py`, with autocomplete from the registry
|
||||
- **Messaging slash commands** — dispatched by `gateway/run.py`, with help text and platform menus generated from the registry
|
||||
|
||||
Installed skills are also exposed as dynamic slash commands on both surfaces. That includes bundled skills like `/plan`, which opens plan mode and saves markdown plans under `.hermes/plans/` relative to the active workspace/backend working directory.
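
A minimal sketch of how a single registry could drive both surfaces. The `CommandSpec` type, the `build_lookup` and `resolve` helpers, and the field names are hypothetical; the actual structure of `COMMAND_REGISTRY` in `hermes_cli/commands.py` is not shown on this page and may differ. The entries are a small subset of the commands documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandSpec:
    """Hypothetical registry entry; the real fields may differ."""
    name: str
    description: str
    aliases: tuple = ()


# Illustrative subset of the commands listed on this page.
COMMAND_REGISTRY = [
    CommandSpec("new", "Start a new session", aliases=("reset",)),
    CommandSpec("statusbar", "Toggle the context/model status bar", aliases=("sb",)),
    CommandSpec("help", "Show this help message"),
]


def build_lookup(registry):
    """Map every command name and alias (lowercased) to its spec."""
    lookup = {}
    for spec in registry:
        for key in (spec.name, *spec.aliases):
            lookup[key.lower()] = spec
    return lookup


def resolve(text, lookup):
    """Return (spec, args) for input like '/New foo', or (None, text) if unknown."""
    if not text.startswith("/"):
        return None, text
    head, _, args = text[1:].partition(" ")
    return lookup.get(head.lower()), args.strip()


LOOKUP = build_lookup(COMMAND_REGISTRY)
```

Because autocomplete, help text, and dispatch would all read from the same list in a design like this, adding a command or alias in one place keeps every surface in sync.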
|
||||
|
||||
## Interactive CLI slash commands
|
||||
|
||||
Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-insensitive.
|
||||
|
||||
### Session
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/new` (alias: `/reset`) | Start a new session (fresh session ID + history) |
|
||||
| `/clear` | Clear screen and start a new session |
|
||||
| `/history` | Show conversation history |
|
||||
| `/save` | Save the current conversation |
|
||||
| `/retry` | Retry the last message (resend to agent) |
|
||||
| `/undo` | Remove the last user/assistant exchange |
|
||||
| `/title` | Set a title for the current session (usage: `/title My Session Name`) |
|
||||
| `/compress` | Manually compress conversation context (flush memories + summarize) |
|
||||
| `/rollback` | List or restore filesystem checkpoints (usage: `/rollback [number]`) |
|
||||
| `/stop` | Kill all running background processes |
|
||||
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response) |
|
||||
| `/resume [name]` | Resume a previously-named session |
|
||||
| `/statusbar` (alias: `/sb`) | Toggle the context/model status bar on or off |
|
||||
| `/background <prompt>` | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
|
||||
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
|
||||
|
||||
### Configuration
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/config` | Show current configuration |
|
||||
| `/model [model-name]` | Show or change the current model. Supports: `/model claude-sonnet-4`, `/model provider:model` (switch providers), `/model custom:model` (custom endpoint), `/model custom:name:model` (named custom provider), `/model custom` (auto-detect from endpoint) |
|
||||
| `/provider` | Show available providers and current provider |
|
||||
| `/prompt` | View/set custom system prompt |
|
||||
| `/personality` | Set a predefined personality |
|
||||
| `/verbose` | Cycle tool progress display: off → new → all → verbose |
|
||||
| `/reasoning` | Manage reasoning effort and display (usage: `/reasoning [level\|show\|hide]`) |
|
||||
| `/skin` | Show or change the display skin/theme |
|
||||
| `/voice [on\|off\|tts\|status]` | Toggle CLI voice mode and spoken playback. Recording uses `voice.record_key` (default: `Ctrl+B`). |
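
The `/model` argument forms listed in the table can be told apart with a small parser. This is an illustrative sketch only: `parse_model_arg`, `_switch`, and the returned dictionary shape are assumptions, not the Hermes implementation.

```python
def _switch(provider, model, name=None, auto_detect=False):
    """Build a uniform result dict (hypothetical shape, for illustration only)."""
    return {
        "action": "switch",
        "provider": provider,
        "name": name,
        "model": model,
        "auto_detect": auto_detect,
    }


def parse_model_arg(arg):
    """Classify the text after `/model` into the forms listed in the table."""
    arg = arg.strip()
    if not arg:
        # Bare `/model`: show the current model.
        return {"action": "show"}
    if arg == "custom":
        # `/model custom`: auto-detect the model from the endpoint.
        return _switch("custom", None, auto_detect=True)
    parts = arg.split(":")
    if parts[0] == "custom":
        if len(parts) == 2:
            # `/model custom:model`: custom endpoint.
            return _switch("custom", parts[1])
        # `/model custom:name:model`: named custom provider.
        return _switch("custom", ":".join(parts[2:]), name=parts[1])
    if len(parts) == 1:
        # `/model claude-sonnet-4`: keep the current provider.
        return _switch(None, arg)
    # `/model provider:model`: switch providers.
    return _switch(parts[0], ":".join(parts[1:]))
```

One limitation of this sketch: a custom-endpoint model ID that itself contains a colon is indistinguishable from the `custom:name:model` form, so a real parser would need an extra rule for that case.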
|
||||
|
||||
### Tools & Skills
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/tools [list\|disable\|enable] [name...]` | Manage tools: list available tools, or disable/enable specific tools for the current session. Disabling a tool removes it from the agent's toolset and triggers a session reset. |
|
||||
| `/toolsets` | List available toolsets |
|
||||
| `/browser [connect\|disconnect\|status]` | Manage local Chrome CDP connection. `connect` attaches browser tools to a running Chrome instance (default: `ws://localhost:9222`). `disconnect` detaches. `status` shows current connection. Auto-launches Chrome if no debugger is detected. |
|
||||
| `/skills` | Search, install, inspect, or manage skills from online registries |
|
||||
| `/cron` | Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove) |
|
||||
| `/reload-mcp` | Reload MCP servers from config.yaml |
|
||||
| `/plugins` | List installed plugins and their status |
|
||||
|
||||
### Info
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/help` | Show this help message |
|
||||
| `/usage` | Show token usage, cost breakdown, and session duration |
|
||||
| `/insights` | Show usage insights and analytics (last 30 days) |
|
||||
| `/platforms` | Show gateway/messaging platform status |
|
||||
| `/paste` | Check clipboard for an image and attach it |
|
||||
|
||||
### Exit
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/quit` | Exit the CLI (also: `/exit`, `/q`) |
|
||||
|
||||
### Dynamic CLI slash commands
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/<skill-name>` | Load any installed skill as an on-demand command. Example: `/gif-search`, `/github-pr-workflow`, `/excalidraw`. |
|
||||
| `/skills ...` | Search, browse, inspect, install, audit, publish, and configure skills from registries and the official optional-skills catalog. |
|
||||
|
||||
### Quick commands
|
||||
|
||||
User-defined quick commands from `quick_commands` in `~/.hermes/config.yaml` are also available as slash commands. These are resolved at dispatch time, not shown in the built-in autocomplete/help tables.
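
A rough sketch of what "resolved at dispatch time" can mean in practice. The `dispatch` function, the built-ins-first precedence, and the shape of the `quick_commands` mapping are all assumptions made for illustration; the real schema in `~/.hermes/config.yaml` is not documented on this page.

```python
# Illustrative built-in table; the real one is derived from the command registry.
BUILTIN_COMMANDS = {
    "help": "Show this help message",
    "usage": "Show token usage, cost breakdown, and session duration",
}


def autocomplete_entries():
    """Only built-ins are offered; quick commands are not listed here."""
    return sorted(BUILTIN_COMMANDS)


def dispatch(text, config):
    """Resolve a slash command, consulting quick_commands on every call."""
    name = text.lstrip("/").split(" ", 1)[0].lower()
    if name in BUILTIN_COMMANDS:
        return ("builtin", name)
    # Assumed shape: a mapping from command name to an arbitrary definition.
    quick = config.get("quick_commands", {})
    if name in quick:
        return ("quick", quick[name])
    return ("unknown", name)
```

Since the lookup happens on each call, a quick command added to the loaded config is picked up by the next dispatch without appearing in the autocomplete list.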
|
||||
|
||||
## Messaging slash commands
|
||||
|
||||
The messaging gateway supports the following built-in commands inside Telegram, Discord, Slack, WhatsApp, Signal, Email, and Home Assistant chats:
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/new` | Start a new conversation. |
|
||||
| `/reset` | Reset conversation history. |
|
||||
| `/status` | Show session info. |
|
||||
| `/stop` | Kill all running background processes and interrupt the running agent. |
|
||||
| `/model [provider:model]` | Show or change the model. Supports provider switches (`/model zai:glm-5`), custom endpoints (`/model custom:model`), named custom providers (`/model custom:local:qwen`), and auto-detect (`/model custom`). |
|
||||
| `/provider` | Show provider availability and auth status. |
|
||||
| `/personality [name]` | Set a personality overlay for the session. |
|
||||
| `/retry` | Retry the last message. |
|
||||
| `/undo` | Remove the last exchange. |
|
||||
| `/sethome` | Mark the current chat as the platform home channel for deliveries. |
|
||||
| `/compress` | Manually compress conversation context. |
|
||||
| `/title [name]` | Set or show the session title. |
|
||||
| `/resume [name]` | Resume a previously named session. |
|
||||
| `/usage` | Show token usage, estimated cost breakdown (input/output), context window state, and session duration. |
|
||||
| `/insights [days]` | Show usage analytics. |
|
||||
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display. |
|
||||
| `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. |
|
||||
| `/rollback [number]` | List or restore filesystem checkpoints. |
|
||||
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
|
||||
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
|
||||
| `/reload-mcp` | Reload MCP servers from config. |
|
||||
| `/approve [session\|always]` | Approve and execute a pending dangerous command. `session` approves for this session only; `always` adds to permanent allowlist. |
|
||||
| `/deny` | Reject a pending dangerous command. |
|
||||
| `/update` | Update Hermes Agent to the latest version. |
|
||||
| `/help` | Show messaging help. |
|
||||
| `/<skill-name>` | Invoke any installed skill by name. |
|
||||
|
||||
## Notes
|
||||
|
||||
- `/skin`, `/tools`, `/toolsets`, `/browser`, `/config`, `/prompt`, `/cron`, `/skills`, `/platforms`, `/paste`, `/verbose`, `/statusbar`, and `/plugins` are **CLI-only** commands.
|
||||
- `/status`, `/sethome`, `/update`, `/approve`, and `/deny` are **messaging-only** commands.
|
||||
- `/background`, `/voice`, `/reload-mcp`, and `/rollback` work in **both** the CLI and the messaging gateway.
|
||||
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.
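
The notes above can be encoded as a small lookup. This is a sketch, not Hermes code: `surfaces_for` is a hypothetical helper, and it deliberately returns `None` for commands the notes do not classify instead of guessing.

```python
# Command sets copied from the notes above.
CLI_ONLY = {
    "skin", "tools", "toolsets", "browser", "config", "prompt", "cron",
    "skills", "platforms", "paste", "verbose", "statusbar", "plugins",
}
MESSAGING_ONLY = {"status", "sethome", "update", "approve", "deny"}
BOTH = {"background", "voice", "reload-mcp", "rollback"}


def surfaces_for(command):
    """Return the surfaces a command is listed for, or None if the notes are silent."""
    name = command.lstrip("/").lower()
    if name in CLI_ONLY:
        return {"cli"}
    if name in MESSAGING_ONLY:
        return {"messaging"}
    if name in BOTH:
        return {"cli", "messaging"}
    return None
```

For example, `surfaces_for("/sethome")` yields only the messaging surface, while `surfaces_for("/rollback")` yields both.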
|
||||
163
hermes_code/website/docs/reference/tools-reference.md
Normal file
|
|
@ -0,0 +1,163 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Built-in Tools Reference"
|
||||
description: "Authoritative reference for Hermes built-in tools, grouped by toolset"
|
||||
---
|
||||
|
||||
# Built-in Tools Reference
|
||||
|
||||
This page documents the built-in Hermes tool registry as it exists in code. Availability can still vary by platform, credentials, and enabled toolsets.
|
||||
|
||||
## `browser` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
|
||||
| `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
|
||||
| `browser_close` | Close the browser session and release resources. Call this when done with browser tasks to free up Browserbase session quota. | — |
|
||||
| `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
|
||||
| `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
|
||||
| `browser_navigate` | Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need… | — |
|
||||
| `browser_press` | Press a keyboard key. Useful for submitting forms (Enter), navigating (Tab), or keyboard shortcuts. Requires browser_navigate to be called first. | — |
|
||||
| `browser_scroll` | Scroll the page in a direction. Use this to reveal more content that may be below or above the current viewport. Requires browser_navigate to be called first. | — |
|
||||
| `browser_snapshot` | Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: comp… | — |
|
||||
| `browser_type` | Type text into an input field identified by its ref ID. Clears the field first, then types the new text. Requires browser_navigate and browser_snapshot to be called first. | — |
|
||||
| `browser_vision` | Take a screenshot of the current page and analyze it with vision AI. Use this when you need to visually understand what's on the page - especially useful for CAPTCHAs, visual verification challenges, complex layouts, or when the text snaps… | — |
|
||||
|
||||
## `clarify` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `clarify` | Ask the user a question when you need clarification, feedback, or a decision before proceeding. Supports two modes: 1. **Multiple choice** — provide up to 4 choices. The user picks one or types their own answer via a 5th 'Other' option. 2.… | — |
|
||||
|
||||
## `code_execution` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `execute_code` | Run a Python script that can call Hermes tools programmatically. Use this when you need 3+ tool calls with processing logic between them, need to filter/reduce large tool outputs before they enter your context, need conditional branching (… | — |
|
||||
|
||||
## `cronjob` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `cronjob` | Unified scheduled-task manager. Use `action="create"`, `"list"`, `"update"`, `"pause"`, `"resume"`, `"run"`, or `"remove"` to manage jobs. Supports skill-backed jobs with one or more attached skills, and `skills=[]` on update clears attached skills. Cron runs happen in fresh sessions with no current-chat context. | — |
|
||||
|
||||
## `delegation` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `delegate_task` | Spawn one or more subagents to work on tasks in isolated contexts. Each subagent gets its own conversation, terminal session, and toolset. Only the final summary is returned -- intermediate tool results never enter your context window. TWO… | — |
|
||||
|
||||
## `file` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `patch` | Targeted find-and-replace edits in files. Use this instead of sed/awk in terminal. Uses fuzzy matching (9 strategies) so minor whitespace/indentation differences won't break it. Returns a unified diff. Auto-runs syntax checks after editing… | — |
|
||||
| `read_file` | Read a text file with line numbers and pagination. Use this instead of cat/head/tail in terminal. Output format: 'LINE_NUM\|CONTENT'. Suggests similar filenames if not found. Use offset and limit for large files. NOTE: Cannot read images o… | — |
|
||||
| `search_files` | Search file contents or find files by name. Use this instead of grep/rg/find/ls in terminal. Ripgrep-backed, faster than shell equivalents. Content search (target='content'): Regex search inside files. Output modes: full matches with line… | — |
|
||||
| `write_file` | Write content to a file, completely replacing existing content. Use this instead of echo/cat heredoc in terminal. Creates parent directories automatically. OVERWRITES the entire file — use 'patch' for targeted edits. | — |
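
The `read_file` row describes its output as `LINE_NUM|CONTENT`. A consumer could split that format as sketched below, assuming one record per line and a numeric prefix that may or may not be padded; `parse_numbered_lines` is a hypothetical helper, not part of the tool registry.

```python
def parse_numbered_lines(output):
    """Split 'LINE_NUM|CONTENT' records into (line_number, content) pairs.

    Lines without a numeric prefix before the first '|' are skipped.
    """
    pairs = []
    for raw in output.splitlines():
        # Split on the first '|' only, so pipes inside the content survive.
        number, sep, content = raw.partition("|")
        if not sep or not number.strip().isdigit():
            continue
        pairs.append((int(number), content))
    return pairs
```

Splitting on the first separator only matters for source files that contain `|` themselves, such as shell pipelines or Markdown tables.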
|
||||
|
||||
## `homeassistant` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `ha_call_service` | Call a Home Assistant service to control a device. Use ha_list_services to discover available services and their parameters for each domain. | — |
|
||||
| `ha_get_state` | Get the detailed state of a single Home Assistant entity, including all attributes (brightness, color, temperature setpoint, sensor readings, etc.). | — |
|
||||
| `ha_list_entities` | List Home Assistant entities. Optionally filter by domain (light, switch, climate, sensor, binary_sensor, cover, fan, etc.) or by area name (living room, kitchen, bedroom, etc.). | — |
|
||||
| `ha_list_services` | List available Home Assistant services (actions) for device control. Shows what actions can be performed on each device type and what parameters they accept. Use this to discover how to control devices found via ha_list_entities. | — |
|
||||
|
||||
## `honcho` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `honcho_conclude` | Write a conclusion about the user back to Honcho's memory. Conclusions are persistent facts that build the user's profile — preferences, corrections, clarifications, project context, or anything the user tells you that should be remembered… | — |
|
||||
| `honcho_context` | Ask Honcho a natural language question and get a synthesized answer. Uses Honcho's LLM (dialectic reasoning) — higher cost than honcho_profile or honcho_search. Can query about any peer: the user (default), the AI assistant, or any named p… | — |
|
||||
| `honcho_profile` | Retrieve the user's peer card from Honcho — a curated list of key facts about them (name, role, preferences, communication style, patterns). Fast, no LLM reasoning, minimal cost. Use this at conversation start or when you need a quick fact… | — |
|
||||
| `honcho_search` | Semantic search over Honcho's stored context about the user. Returns raw excerpts ranked by relevance to your query — no LLM synthesis. Cheaper and faster than honcho_context. Good when you want to find specific past facts and reason over… | — |
|
||||
|
||||
## `image_gen` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `image_generate` | Generate high-quality images from text prompts using FLUX 2 Pro model with automatic 2x upscaling. Creates detailed, artistic images that are automatically upscaled for high-resolution results. Returns a single upscaled image URL. Display it using… | FAL_KEY |
|
||||
|
||||
## `memory` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `memory` | Save important information to persistent memory that survives across sessions. Your memory appears in your system prompt at session start -- it's how you remember things about the user and your environment between conversations. WHEN TO SA… | — |
|
||||
|
||||
## `messaging` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `send_message` | Send a message to a connected messaging platform, or list available targets. IMPORTANT: When the user asks to send to a specific channel or person (not just a bare platform name), call send_message(action='list') FIRST to see available tar… | — |
|
||||
|
||||
## `moa` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `mixture_of_agents` | Route a hard problem through multiple frontier LLMs collaboratively. Makes 5 API calls (4 reference models + 1 aggregator) with maximum reasoning effort — use sparingly for genuinely difficult problems. Best for: complex math, advanced alg… | OPENROUTER_API_KEY |
|
||||
|
||||
## `rl` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `rl_check_status` | Get status and metrics for a training run. RATE LIMITED: enforces 30-minute minimum between checks for the same run. Returns WandB metrics: step, state, reward_mean, loss, percent_correct. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_edit_config` | Update a configuration field. Use rl_get_current_config() first to see all available fields for the selected environment. Each environment has different configurable options. Infrastructure settings (tokenizer, URLs, lora_rank, learning_ra… | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_get_current_config` | Get the current environment configuration. Returns only fields that can be modified: group_size, max_token_length, total_steps, steps_per_eval, use_wandb, wandb_name, max_num_workers. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_get_results` | Get final results and metrics for a completed training run. Returns final metrics and path to trained weights. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_list_environments` | List all available RL environments. Returns environment names, paths, and descriptions. TIP: Read the file_path with file tools to understand how each environment works (verifiers, data loading, rewards). | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_list_runs` | List all training runs (active and completed) with their status. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_select_environment` | Select an RL environment for training. Loads the environment's default configuration. After selecting, use rl_get_current_config() to see settings and rl_edit_config() to modify them. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_start_training` | Start a new RL training run with the current environment and config. Most training parameters (lora_rank, learning_rate, etc.) are fixed. Use rl_edit_config() to set group_size, batch_size, wandb_project before starting. WARNING: Training… | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_stop_training` | Stop a running training job. Use if metrics look bad, training is stagnant, or you want to try different settings. | TINKER_API_KEY, WANDB_API_KEY |
|
||||
| `rl_test_inference` | Quick inference test for any environment. Runs a few steps of inference + scoring using OpenRouter. Default: 3 steps x 16 completions = 48 rollouts per model, testing 3 models = 144 total. Tests environment loading, prompt construction, in… | TINKER_API_KEY, WANDB_API_KEY |
|
||||
|
||||
## `session_search` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `session_search` | Search your long-term memory of past conversations. This is your recall -- every past session is searchable, and this tool summarizes what happened. USE THIS PROACTIVELY when: - The user says 'we did this before', 'remember when', 'last ti… | — |
|
||||
|
||||
## `skills` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `skill_manage` | Manage skills (create, update, delete). Skills are your procedural memory — reusable approaches for recurring task types. New skills go to ~/.hermes/skills/; existing skills can be modified wherever they live. Actions: create (full SKILL.m… | — |
|
||||
| `skill_view` | Skills allow for loading information about specific tasks and workflows, as well as scripts and templates. Load a skill's full content or access its linked files (references, templates, scripts). First call returns SKILL.md content plus a… | — |
|
||||
| `skills_list` | List available skills (name + description). Use skill_view(name) to load full content. | — |
|
||||
|
||||
## `terminal` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `process` | Manage background processes started with terminal(background=true). Actions: 'list' (show all), 'poll' (check status + new output), 'log' (full output with pagination), 'wait' (block until done or timeout), 'kill' (terminate), 'write' (sen… | — |
|
||||
| `terminal` | Execute shell commands on a Linux environment. Filesystem persists between calls. Do NOT use cat/head/tail to read files — use read_file instead. Do NOT use grep/rg/find to search — use search_files instead. Do NOT use ls to list directori… | — |
|
||||
|
||||
## `todo` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `todo` | Manage your task list for the current session. Use for complex tasks with 3+ steps or when the user provides multiple tasks. Call with no parameters to read the current list. Writing: - Provide 'todos' array to create/update items - merge=… | — |
|
||||
|
||||
## `vision` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `vision_analyze` | Analyze images using AI vision. Provides a comprehensive description and answers a specific question about the image content. | — |
|
||||
|
||||
## `web` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `web_search` | Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions. | PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
|
||||
| `web_extract` | Extract content from web page URLs. Returns page content in markdown format. Also works with PDF URLs — pass the PDF link directly and it converts to markdown text. Pages under 5000 chars return full markdown; larger pages are LLM-summarized. | PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
|
||||
|
||||
## `tts` toolset
|
||||
|
||||
| Tool | Description | Requires environment |
|
||||
|------|-------------|----------------------|
|
||||
| `text_to_speech` | Convert text to speech audio. Returns a MEDIA: path that the platform delivers as a voice message. On Telegram it plays as a voice bubble, on Discord/WhatsApp as an audio attachment. In CLI mode, saves to ~/voice-memos/. Voice and provider… | — |
|
||||
|
||||
|
||||
52
hermes_code/website/docs/reference/toolsets-reference.md
Normal file
|
|
@ -0,0 +1,52 @@
|
|||
---
|
||||
sidebar_position: 4
|
||||
title: "Toolsets Reference"
|
||||
description: "Reference for Hermes core, composite, platform, and dynamic toolsets"
|
||||
---
|
||||
|
||||
# Toolsets Reference
|
||||
|
||||
Toolsets are named bundles of tools that you can enable with `hermes chat --toolsets ...`, configure per platform, or resolve inside the agent runtime.
|
||||
|
||||
| Toolset | Kind | Resolves to |
|
||||
|---------|------|-------------|
|
||||
| `browser` | core | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` |
|
||||
| `clarify` | core | `clarify` |
|
||||
| `code_execution` | core | `execute_code` |
|
||||
| `cronjob` | core | `cronjob` |
|
||||
| `debugging` | composite | `patch`, `process`, `read_file`, `search_files`, `terminal`, `web_extract`, `web_search`, `write_file` |
|
||||
| `delegation` | core | `delegate_task` |
|
||||
| `file` | core | `patch`, `read_file`, `search_files`, `write_file` |
|
||||
| `hermes-acp` | platform | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `delegate_task`, `execute_code`, `memory`, `patch`, `process`, `read_file`, `search_files`, `session_search`, `skill_manage`, `skill_view`, `skills_list`, `terminal`, `todo`, `vision_analyze`, `web_extract`, `web_search`, `write_file` |
|
||||
| `hermes-cli` | platform | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `clarify`, `cronjob`, `delegate_task`, `execute_code`, `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services`, `honcho_conclude`, `honcho_context`, `honcho_profile`, `honcho_search`, `image_generate`, `memory`, `mixture_of_agents`, `patch`, `process`, `read_file`, `search_files`, `send_message`, `session_search`, `skill_manage`, `skill_view`, `skills_list`, `terminal`, `text_to_speech`, `todo`, `vision_analyze`, `web_extract`, `web_search`, `write_file` |
|
||||
| `hermes-discord` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-email` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-gateway` | composite | Union of all messaging platform toolsets |
|
||||
| `hermes-homeassistant` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-signal` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-slack` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-sms` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-telegram` | platform | _(same as hermes-cli)_ |
|
||||
| `hermes-whatsapp` | platform | _(same as hermes-cli)_ |
|
||||
| `homeassistant` | core | `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services` |
|
||||
| `honcho` | core | `honcho_conclude`, `honcho_context`, `honcho_profile`, `honcho_search` |
|
||||
| `image_gen` | core | `image_generate` |
|
||||
| `memory` | core | `memory` |
|
||||
| `messaging` | core | `send_message` |
|
||||
| `moa` | core | `mixture_of_agents` |
|
||||
| `rl` | core | `rl_check_status`, `rl_edit_config`, `rl_get_current_config`, `rl_get_results`, `rl_list_environments`, `rl_list_runs`, `rl_select_environment`, `rl_start_training`, `rl_stop_training`, `rl_test_inference` |
|
||||
| `safe` | composite | `image_generate`, `mixture_of_agents`, `vision_analyze`, `web_extract`, `web_search` |
|
||||
| `search` | core | `web_search` |
|
||||
| `session_search` | core | `session_search` |
|
||||
| `skills` | core | `skill_manage`, `skill_view`, `skills_list` |
|
||||
| `terminal` | core | `process`, `terminal` |
|
||||
| `todo` | core | `todo` |
|
||||
| `tts` | core | `text_to_speech` |
|
||||
| `vision` | core | `vision_analyze` |
|
||||
| `web` | core | `web_extract`, `web_search` |
|
||||
|
||||
## Dynamic toolsets
|
||||
|
||||
- `mcp-<server>` — generated at runtime for each configured MCP server.
|
||||
- Custom toolsets can be created in configuration and resolved at startup.
|
||||
- Wildcards: `all` and `*` expand to every registered toolset.
|
||||
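The resolution rules above can be sketched in Python. This is a minimal illustration, not Hermes's actual resolver: the `TOOLSETS` dictionary holds only a few rows copied from the table, and `resolve_toolsets` is a hypothetical helper name.

```python
# Hypothetical registry: a few rows copied from the table above.
TOOLSETS = {
    "web": {"web_extract", "web_search"},
    "terminal": {"process", "terminal"},
    "file": {"patch", "read_file", "search_files", "write_file"},
    "skills": {"skill_manage", "skill_view", "skills_list"},
}


def resolve_toolsets(spec: str) -> set[str]:
    """Expand a comma-separated toolset spec into a flat set of tool names."""
    names = [part.strip() for part in spec.split(",") if part.strip()]
    if any(name in ("all", "*") for name in names):
        names = list(TOOLSETS)  # wildcards expand to every registered toolset
    tools: set[str] = set()
    for name in names:
        if name not in TOOLSETS:
            raise KeyError(f"unknown toolset: {name}")
        tools |= TOOLSETS[name]  # overlapping toolsets collapse into one set
    return tools
```

With this sketch, `resolve_toolsets("web,terminal")` yields the four tools of those two rows, which is the same kind of expansion `--toolsets "web,terminal"` asks for.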
8
hermes_code/website/docs/user-guide/_category_.json
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "User Guide",
|
||||
"position": 2,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Learn how to use Hermes Agent effectively."
|
||||
}
|
||||
}
|
||||
203
hermes_code/website/docs/user-guide/checkpoints-and-rollback.md
Normal file
|
|
@ -0,0 +1,203 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Checkpoints and /rollback"
|
||||
description: "Filesystem safety nets for destructive operations using shadow git repos and automatic snapshots"
|
||||
---
|
||||
|
||||
# Checkpoints and `/rollback`
|
||||
|
||||
Hermes Agent automatically snapshots your project before **destructive operations** and lets you restore it with a single command. Checkpoints are **enabled by default** — there's zero cost when no file-mutating tools fire.
|
||||
|
||||
This safety net is powered by an internal **Checkpoint Manager** that keeps a separate shadow git repository under `~/.hermes/checkpoints/` — your real project `.git` is never touched.
|
||||
|
||||
## What Triggers a Checkpoint
|
||||
|
||||
Checkpoints are taken automatically before:
|
||||
|
||||
- **File tools** — `write_file` and `patch`
|
||||
- **Destructive terminal commands** — `rm`, `mv`, `sed -i`, `truncate`, `shred`, output redirects (`>`), and `git reset`/`clean`/`checkout`
|
||||
|
||||
The agent creates **at most one checkpoint per directory per turn**, so long-running sessions don't spam snapshots.
|
||||
|
||||
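The trigger rules above can be approximated as follows. This is a rough sketch under stated assumptions: `is_destructive` and `TurnCheckpoints` are hypothetical names, and the real detector very likely parses commands more carefully than a whitespace split.

```python
import re

DESTRUCTIVE_WORDS = {"rm", "mv", "truncate", "shred"}
GIT_SUBCOMMANDS = {"reset", "clean", "checkout"}


def is_destructive(command: str) -> bool:
    """Rough check for the command classes listed above."""
    words = command.split()
    if not words:
        return False
    if words[0] in DESTRUCTIVE_WORDS:
        return True
    if words[0] == "sed" and "-i" in words[1:]:
        return True
    if words[0] == "git" and len(words) > 1 and words[1] in GIT_SUBCOMMANDS:
        return True
    # A lone ">" (not ">>") is treated as an overwriting redirect.
    return re.search(r"(?<!>)>(?!>)", command) is not None


class TurnCheckpoints:
    """Allows at most one checkpoint per directory per conversation turn."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def new_turn(self) -> None:
        self._seen.clear()

    def should_snapshot(self, directory: str) -> bool:
        if directory in self._seen:
            return False  # already snapshotted during this turn
        self._seen.add(directory)
        return True
```

Keeping the per-turn set separate from the detector means repeated edits inside one turn cost a set lookup rather than another commit.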
## Quick Reference
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/rollback` | List all checkpoints with change stats |
|
||||
| `/rollback <N>` | Restore to checkpoint N (also undoes last chat turn) |
|
||||
| `/rollback diff <N>` | Preview diff between checkpoint N and current state |
|
||||
| `/rollback <N> <file>` | Restore a single file from checkpoint N |
|
||||
|
||||
## How Checkpoints Work
|
||||
|
||||
At a high level:
|
||||
|
||||
- Hermes detects when tools are about to **modify files** in your working tree.
- Once per conversation turn (per directory), it:
  - Resolves a reasonable project root for the file.
  - Initialises or reuses a **shadow git repo** tied to that directory.
  - Stages and commits the current state with a short, human‑readable reason.
- These commits form a checkpoint history that you can inspect and restore via `/rollback`.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
user["User command\n(hermes, gateway)"]
|
||||
agent["AIAgent\n(run_agent.py)"]
|
||||
tools["File & terminal tools"]
|
||||
cpMgr["CheckpointManager"]
|
||||
shadowRepo["Shadow git repo\n~/.hermes/checkpoints/<hash>"]
|
||||
|
||||
user --> agent
|
||||
agent -->|"tool call"| tools
|
||||
tools -->|"before mutate\nensure_checkpoint()"| cpMgr
|
||||
cpMgr -->|"git add/commit"| shadowRepo
|
||||
cpMgr -->|"OK / skipped"| tools
|
||||
tools -->|"apply changes"| agent
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
Checkpoints are enabled by default. Configure in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
checkpoints:
  enabled: true          # master switch (default: true)
  max_snapshots: 50      # max checkpoints per directory
```
|
||||
|
||||
To disable:
|
||||
|
||||
```yaml
checkpoints:
  enabled: false
```
|
||||
|
||||
When disabled, the Checkpoint Manager is a no‑op and never attempts git operations.
|
||||
|
||||
## Listing Checkpoints
|
||||
|
||||
From a CLI session:
|
||||
|
||||
```
|
||||
/rollback
|
||||
```
|
||||
|
||||
Hermes responds with a formatted list showing change statistics:
|
||||
|
||||
```text
|
||||
📸 Checkpoints for /path/to/project:
|
||||
|
||||
1. 4270a8c 2026-03-16 04:36 before patch (1 file, +1/-0)
|
||||
2. eaf4c1f 2026-03-16 04:35 before write_file
|
||||
3. b3f9d2e 2026-03-16 04:34 before terminal: sed -i s/old/new/ config.py (1 file, +1/-1)
|
||||
|
||||
/rollback <N> restore to checkpoint N
|
||||
/rollback diff <N> preview changes since checkpoint N
|
||||
/rollback <N> <file> restore a single file from checkpoint N
|
||||
```
|
||||
|
||||
Each entry shows:
|
||||
|
||||
- Short hash
|
||||
- Timestamp
|
||||
- Reason (what triggered the snapshot)
|
||||
- Change summary (files changed, insertions/deletions)
|
||||
|
||||
## Previewing Changes with `/rollback diff`
|
||||
|
||||
Before committing to a restore, preview what has changed since a checkpoint:
|
||||
|
||||
```
|
||||
/rollback diff 1
|
||||
```
|
||||
|
||||
This shows a git diff stat summary followed by the actual diff:
|
||||
|
||||
```text
|
||||
test.py | 2 +-
|
||||
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||
|
||||
diff --git a/test.py b/test.py
|
||||
--- a/test.py
|
||||
+++ b/test.py
|
||||
@@ -1 +1 @@
|
||||
-print('original content')
|
||||
+print('modified content')
|
||||
```
|
||||
|
||||
Long diffs are capped at 80 lines to avoid flooding the terminal.
|
||||
|
||||
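The 80-line cap can be pictured with a small helper. This is only a sketch: `cap_diff` is a hypothetical name, and the wording of the truncation notice is an assumption, not Hermes's actual output.

```python
MAX_DIFF_LINES = 80  # cap stated above


def cap_diff(diff_text: str, limit: int = MAX_DIFF_LINES) -> str:
    """Return the diff unchanged if short, else its first `limit` lines plus a notice."""
    lines = diff_text.splitlines()
    if len(lines) <= limit:
        return diff_text
    hidden = len(lines) - limit
    # Assumed notice format, for illustration only.
    return "\n".join(lines[:limit] + [f"... ({hidden} more lines truncated)"])
```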
## Restoring with `/rollback`
|
||||
|
||||
Restore to a checkpoint by number:
|
||||
|
||||
```
|
||||
/rollback 1
|
||||
```
|
||||
|
||||
Behind the scenes, Hermes:
|
||||
|
||||
1. Verifies the target commit exists in the shadow repo.
|
||||
2. Takes a **pre‑rollback snapshot** of the current state so you can "undo the undo" later.
|
||||
3. Restores tracked files in your working directory.
|
||||
4. **Undoes the last conversation turn** so the agent's context matches the restored filesystem state.
|
||||
|
||||
On success:
|
||||
|
||||
```text
|
||||
✅ Restored to checkpoint 4270a8c5: before patch
|
||||
A pre-rollback snapshot was saved automatically.
|
||||
(^_^)b Undid 4 message(s). Removed: "Now update test.py to ..."
|
||||
4 message(s) remaining in history.
|
||||
Chat turn undone to match restored file state.
|
||||
```
|
||||
|
||||
The conversation undo ensures the agent doesn't "remember" changes that have been rolled back, avoiding confusion on the next turn.
|
||||
|
||||
## Single-File Restore
|
||||
|
||||
Restore just one file from a checkpoint without affecting the rest of the directory:
|
||||
|
||||
```
|
||||
/rollback 1 src/broken_file.py
|
||||
```
|
||||
|
||||
This is useful when the agent made changes to multiple files but only one needs to be reverted.
|
||||
|
||||
## Safety and Performance Guards
|
||||
|
||||
To keep checkpointing safe and fast, Hermes applies several guardrails:
|
||||
|
||||
- **Git availability** — if `git` is not found on `PATH`, checkpoints are transparently disabled.
|
||||
- **Directory scope** — Hermes skips overly broad directories (root `/`, home `$HOME`).
|
||||
- **Repository size** — directories with more than 50,000 files are skipped to avoid slow git operations.
|
||||
- **No‑change snapshots** — if there are no changes since the last snapshot, the checkpoint is skipped.
|
||||
- **Non‑fatal errors** — all errors inside the Checkpoint Manager are logged at debug level; your tools continue to run.
|
||||
|
||||
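The first three guards above can be sketched as plain checks. The function names and the returned reason strings are hypothetical; only the rules themselves (git on `PATH`, no `/` or `$HOME`, at most 50,000 files) come from the list.

```python
import os
import shutil
from pathlib import Path

MAX_FILES = 50_000  # limit stated above


def is_too_broad(directory) -> bool:
    """True for the filesystem root and the user's home directory."""
    path = Path(directory).resolve()
    return path == Path(path.anchor) or path == Path.home().resolve()


def exceeds_file_limit(directory, max_files: int = MAX_FILES) -> bool:
    """Count files lazily and stop as soon as the limit is passed."""
    count = 0
    for _root, _dirs, files in os.walk(directory):
        count += len(files)
        if count > max_files:
            return True
    return False


def checkpoint_skip_reason(directory, max_files: int = MAX_FILES):
    """Return why a checkpoint would be skipped, or None if it may proceed."""
    if shutil.which("git") is None:
        return "git not found on PATH"
    if is_too_broad(directory):
        return "directory too broad"
    if exceeds_file_limit(directory, max_files):
        return "too many files"
    return None
```

Stopping the walk early keeps the size check cheap on exactly the large trees it is meant to reject.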
## Where Checkpoints Live
|
||||
|
||||
All shadow repos live under:
|
||||
|
||||
```text
|
||||
~/.hermes/checkpoints/
|
||||
├── <hash1>/ # shadow git repo for one working directory
|
||||
├── <hash2>/
|
||||
└── ...
|
||||
```
|
||||
|
||||
Each `<hash>` is derived from the absolute path of the working directory. Inside each shadow repo you'll find:
|
||||
|
||||
- Standard git internals (`HEAD`, `refs/`, `objects/`)
|
||||
- An `info/exclude` file containing a curated ignore list
|
||||
- A `HERMES_WORKDIR` file pointing back to the original project root
|
||||
|
||||
You normally never need to touch these manually.
|
||||
|
||||
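One way to derive a stable `<hash>` from a working directory is shown below. The text above only says the hash comes from the absolute path, so the choice of SHA-256 and the 16-character truncation are assumptions made for this sketch, as is the helper name `shadow_repo_path`.

```python
import hashlib
from pathlib import Path

CHECKPOINT_ROOT = Path.home() / ".hermes" / "checkpoints"


def shadow_repo_path(workdir: str, root: Path = CHECKPOINT_ROOT) -> Path:
    """Map a working directory to a stable shadow-repo location."""
    absolute = str(Path(workdir).resolve())
    # Assumption: SHA-256 truncated to 16 hex chars; Hermes may use another scheme.
    digest = hashlib.sha256(absolute.encode("utf-8")).hexdigest()[:16]
    return root / digest
```

Because the mapping depends only on the resolved path, the same project always lands in the same shadow repo, and two different projects do not share history.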
## Best Practices
|
||||
|
||||
- **Leave checkpoints enabled** — they're on by default and have zero cost when no files are modified.
|
||||
- **Use `/rollback diff` before restoring** — preview what will change to pick the right checkpoint.
|
||||
- **Use `/rollback` instead of `git reset`** when you want to undo agent-driven changes only.
|
||||
- **Combine with Git worktrees** for maximum safety — keep each Hermes session in its own worktree/branch, with checkpoints as an extra layer.
|
||||
|
||||
For running multiple agents in parallel on the same repo, see the guide on [Git worktrees](./git-worktrees.md).
|
||||
349
hermes_code/website/docs/user-guide/cli.md
Normal file
|
|
@ -0,0 +1,349 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "CLI Interface"
|
||||
description: "Master the Hermes Agent terminal interface — commands, keybindings, personalities, and more"
|
||||
---
|
||||
|
||||
# CLI Interface
|
||||
|
||||
Hermes Agent's CLI is a full terminal user interface (TUI) — not a web UI. It features multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output. Built for people who live in the terminal.
|
||||
|
||||
## Running the CLI
|
||||
|
||||
```bash
|
||||
# Start an interactive session (default)
|
||||
hermes
|
||||
|
||||
# Single query mode (non-interactive)
|
||||
hermes chat -q "Hello"
|
||||
|
||||
# With a specific model
|
||||
hermes chat --model "anthropic/claude-sonnet-4"
|
||||
|
||||
# With a specific provider
|
||||
hermes chat --provider nous # Use Nous Portal
|
||||
hermes chat --provider openrouter # Force OpenRouter
|
||||
|
||||
# With specific toolsets
|
||||
hermes chat --toolsets "web,terminal,skills"
|
||||
|
||||
# Start with one or more skills preloaded
|
||||
hermes -s hermes-agent-dev,github-auth
|
||||
hermes chat -s github-pr-workflow -q "open a draft PR"
|
||||
|
||||
# Resume previous sessions
|
||||
hermes --continue # Resume the most recent CLI session (-c)
|
||||
hermes --resume <session_id> # Resume a specific session by ID (-r)
|
||||
|
||||
# Verbose mode (debug output)
|
||||
hermes chat --verbose
|
||||
|
||||
# Isolated git worktree (for running multiple agents in parallel)
|
||||
hermes -w # Interactive mode in worktree
|
||||
hermes -w -q "Fix issue #123" # Single query in worktree
|
||||
```
|
||||
|
||||
## Interface Layout
|
||||
|
||||
<img className="docs-terminal-figure" src="/img/docs/cli-layout.svg" alt="Stylized preview of the Hermes CLI layout showing the banner, conversation area, and fixed input prompt." />
|
||||
<p className="docs-figure-caption">The Hermes CLI banner, conversation stream, and fixed input prompt rendered as a stable docs figure instead of fragile text art.</p>
|
||||
|
||||
The welcome banner shows your model, terminal backend, working directory, available tools, and installed skills at a glance.
|
||||
|
||||
### Status Bar
|
||||
|
||||
A persistent status bar sits above the input area, updating in real time:
|
||||
|
||||
```
|
||||
⚕ claude-sonnet-4-20250514 │ 12.4K/200K │ [█░░░░░░░░░] 6% │ $0.06 │ 15m
|
||||
```
|
||||
|
||||
| Element | Description |
|
||||
|---------|-------------|
|
||||
| Model name | Current model (truncated if longer than 26 chars) |
|
||||
| Token count | Context tokens used / max context window |
|
||||
| Context bar | Visual fill indicator with color-coded thresholds |
|
||||
| Cost | Estimated session cost (or `n/a` for unknown/zero-priced models) |
|
||||
| Duration | Elapsed session time |
|
||||
|
||||
The bar adapts to terminal width — full layout at ≥ 76 columns, compact at 52–75, minimal (model + duration only) below 52.
|
||||
|
||||
**Context color coding:**
|
||||
|
||||
| Color | Threshold | Meaning |
|
||||
|-------|-----------|---------|
|
||||
| Green | < 50% | Plenty of room |
|
||||
| Yellow | 50–80% | Getting full |
|
||||
| Orange | 80–95% | Approaching limit |
|
||||
| Red | ≥ 95% | Near overflow — consider `/compress` |
|
||||
|
||||
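The width breakpoints and colour thresholds above reduce to two small lookups. This is a sketch with hypothetical function names; it assumes each band includes its lower bound (so exactly 50% is yellow and exactly 80% is orange), which the tables leave open.

```python
def layout_for_width(columns: int) -> str:
    """Pick the status bar layout for a terminal width."""
    if columns >= 76:
        return "full"
    if columns >= 52:
        return "compact"
    return "minimal"  # model + duration only


def context_color(used_tokens: int, max_tokens: int) -> str:
    """Pick the context bar colour from the fill ratio."""
    ratio = used_tokens / max_tokens
    if ratio >= 0.95:
        return "red"
    if ratio >= 0.80:
        return "orange"
    if ratio >= 0.50:
        return "yellow"
    return "green"
```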
Use `/usage` for a detailed breakdown including per-category costs (input vs output tokens).
|
||||
|
||||
### Session Resume Display
|
||||
|
||||
When resuming a previous session (`hermes -c` or `hermes --resume <id>`), a "Previous Conversation" panel appears between the banner and the input prompt, showing a compact recap of the conversation history. See [Sessions — Conversation Recap on Resume](sessions.md#conversation-recap-on-resume) for details and configuration.
|
||||
|
||||
## Keybindings
|
||||
|
||||
| Key | Action |
|
||||
|-----|--------|
|
||||
| `Enter` | Send message |
|
||||
| `Alt+Enter` or `Ctrl+J` | New line (multi-line input) |
|
||||
| `Alt+V` | Paste an image from the clipboard when supported by the terminal |
|
||||
| `Ctrl+V` | Paste text and opportunistically attach clipboard images |
|
||||
| `Ctrl+B` | Start/stop voice recording when voice mode is enabled (`voice.record_key`, default: `ctrl+b`) |
|
||||
| `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) |
|
||||
| `Ctrl+D` | Exit |
|
||||
| `Tab` | Accept auto-suggestion (ghost text) or autocomplete slash commands |
|
||||
|
||||
## Slash Commands
|
||||
|
||||
Type `/` to see the autocomplete dropdown. Hermes supports a large set of CLI slash commands, dynamic skill commands, and user-defined quick commands.
|
||||
|
||||
Common examples:
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/help` | Show command help |
|
||||
| `/model` | Show or change the current model |
|
||||
| `/tools` | List currently available tools |
|
||||
| `/skills browse` | Browse the skills hub and official optional skills |
|
||||
| `/background <prompt>` | Run a prompt in a separate background session |
|
||||
| `/skin` | Show or switch the active CLI skin |
|
||||
| `/voice on` | Enable CLI voice mode (press `Ctrl+B` to record) |
|
||||
| `/voice tts` | Toggle spoken playback for Hermes replies |
|
||||
| `/reasoning high` | Increase reasoning effort |
|
||||
| `/title My Session` | Name the current session |
|
||||
|
||||
For the full built-in CLI and messaging lists, see [Slash Commands Reference](../reference/slash-commands.md).
|
||||
|
||||
For setup, providers, silence tuning, and messaging/Discord voice usage, see [Voice Mode](features/voice-mode.md).
|
||||
|
||||
:::tip
|
||||
Commands are case-insensitive — `/HELP` works the same as `/help`. Installed skills also become slash commands automatically.
|
||||
:::
|
||||
|
||||
## Quick Commands
|
||||
|
||||
You can define custom commands that run shell commands instantly without invoking the LLM. These work in both the CLI and messaging platforms (Telegram, Discord, etc.).
|
||||
|
||||
```yaml
# ~/.hermes/config.yaml
quick_commands:
  status:
    type: exec
    command: systemctl status hermes-agent
  gpu:
    type: exec
    command: nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader
```
|
||||
|
||||
Then type `/status` or `/gpu` in any chat. See the [Configuration guide](/docs/user-guide/configuration#quick-commands) for more examples.
|
||||
|
||||
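Conceptually, a quick command is a lookup that short-circuits the LLM. The sketch below mirrors the YAML above as a plain dictionary; `find_quick_command` and `run_quick_command` are hypothetical names, and how Hermes actually loads and dispatches these entries is not shown in this guide.

```python
import subprocess

# Mirrors the YAML example as a plain dict; loading the config file is left out.
QUICK_COMMANDS = {
    "status": {"type": "exec", "command": "systemctl status hermes-agent"},
    "gpu": {
        "type": "exec",
        "command": "nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader",
    },
}


def find_quick_command(message: str, commands=QUICK_COMMANDS):
    """Return the shell command for a slash message, or None if there is none."""
    if not message.startswith("/"):
        return None
    parts = message[1:].split(maxsplit=1)
    if not parts:
        return None
    entry = commands.get(parts[0].lower())  # slash commands are case-insensitive
    if entry is None or entry.get("type") != "exec":
        return None
    return entry["command"]


def run_quick_command(message: str):
    """Run the matching command directly, or return None to fall through to the agent."""
    command = find_quick_command(message)
    if command is None:
        return None
    done = subprocess.run(command, shell=True, capture_output=True, text=True)
    return done.stdout + done.stderr
```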
## Preloading Skills at Launch
|
||||
|
||||
If you already know which skills you want active for the session, pass them at launch time:
|
||||
|
||||
```bash
|
||||
hermes -s hermes-agent-dev,github-auth
|
||||
hermes chat -s github-pr-workflow -s github-auth
|
||||
```
|
||||
|
||||
Hermes loads each named skill into the session prompt before the first turn. The same flag works in interactive mode and single-query mode.
|
||||
|
||||
## Skill Slash Commands
|
||||
|
||||
Every installed skill in `~/.hermes/skills/` is automatically registered as a slash command. The skill name becomes the command:
|
||||
|
||||
```
|
||||
/gif-search funny cats
|
||||
/axolotl help me fine-tune Llama 3 on my dataset
|
||||
/github-pr-workflow create a PR for the auth refactor
|
||||
|
||||
# Just the skill name loads it and lets the agent ask what you need:
|
||||
/excalidraw
|
||||
```
|
||||
|
||||
## Personalities
|
||||
|
||||
Set a predefined personality to change the agent's tone:
|
||||
|
||||
```
|
||||
/personality pirate
|
||||
/personality kawaii
|
||||
/personality concise
|
||||
```
|
||||
|
||||
Built-in personalities include: `helpful`, `concise`, `technical`, `creative`, `teacher`, `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`, `noir`, `uwu`, `philosopher`, `hype`.
|
||||
|
||||
You can also define custom personalities in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
personalities:
  helpful: "You are a helpful, friendly AI assistant."
  kawaii: "You are a kawaii assistant! Use cute expressions..."
  pirate: "Arrr! Ye be talkin' to Captain Hermes..."
  # Add your own!
```
|
||||
|
||||
## Multi-line Input
|
||||
|
||||
There are two ways to enter multi-line messages:
|
||||
|
||||
1. **`Alt+Enter` or `Ctrl+J`** — inserts a new line
|
||||
2. **Backslash continuation** — end a line with `\` to continue:
|
||||
|
||||
```
|
||||
❯ Write a function that:\
|
||||
1. Takes a list of numbers\
|
||||
2. Returns the sum
|
||||
```
|
||||
|
||||
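Backslash continuation can be modelled as a simple join. This is an illustrative sketch, not the CLI's input handler: `join_continued_lines` is a hypothetical name, and it assumes the continued lines are joined with newlines.

```python
def join_continued_lines(lines):
    """Merge input lines that end with a backslash into one message."""
    parts = []
    for line in lines:
        if line.endswith("\\"):
            parts.append(line[:-1])  # drop the trailing backslash, keep reading
        else:
            parts.append(line)
            break  # first line without a backslash ends the message
    return "\n".join(parts)
```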
:::info
|
||||
Pasting multi-line text is supported: paste the content directly, or use `Alt+Enter` / `Ctrl+J` to insert newlines as you type.
|
||||
:::
|
||||
|
||||
## Interrupting the Agent
|
||||
|
||||
You can interrupt the agent at any point:
|
||||
|
||||
- **Type a new message + Enter** while the agent is working — it interrupts and processes your new instructions
|
||||
- **`Ctrl+C`** — interrupt the current operation (press twice within 2s to force exit)
|
||||
- In-progress terminal commands are killed immediately (SIGTERM, then SIGKILL after 1s)
|
||||
- Multiple messages typed during interrupt are combined into one prompt
|
||||
|
||||
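The double-press rule for `Ctrl+C` can be sketched as a tiny state machine. The class name and return values are hypothetical; timestamps are passed in explicitly so the logic can be exercised without a real clock.

```python
class InterruptTracker:
    """Decides whether a Ctrl+C interrupts the agent or force-exits."""

    def __init__(self, window_seconds: float = 2.0) -> None:
        self.window = window_seconds
        self._last_press = None

    def on_ctrl_c(self, now: float) -> str:
        previous, self._last_press = self._last_press, now
        if previous is not None and now - previous <= self.window:
            return "force_exit"  # second press inside the window
        return "interrupt"
```

In a real event loop `now` would typically come from `time.monotonic()`, which is not affected by wall-clock adjustments.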
## Tool Progress Display
|
||||
|
||||
The CLI shows animated feedback as the agent works:
|
||||
|
||||
**Thinking animation** (during API calls):
|
||||
```
|
||||
◜ (。•́︿•̀。) pondering... (1.2s)
|
||||
◠ (⊙_⊙) contemplating... (2.4s)
|
||||
✧٩(ˊᗜˋ*)و✧ got it! (3.1s)
|
||||
```
|
||||
|
||||
**Tool execution feed:**
|
||||
```
|
||||
┊ 💻 terminal `ls -la` (0.3s)
|
||||
┊ 🔍 web_search (1.2s)
|
||||
┊ 📄 web_extract (2.1s)
|
||||
```
|
||||
|
||||
Cycle through display modes with `/verbose`: `off → new → all → verbose`.
|
||||
|
||||
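The `/verbose` cycle is a rotation over four modes. A minimal sketch, assuming the cycle wraps from `verbose` back to `off` (the arrow notation above does not say so explicitly) and using a hypothetical helper name:

```python
VERBOSE_MODES = ["off", "new", "all", "verbose"]


def next_verbose_mode(current: str) -> str:
    """Return the mode that follows `current` in the cycle."""
    index = VERBOSE_MODES.index(current)
    return VERBOSE_MODES[(index + 1) % len(VERBOSE_MODES)]
```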
## Session Management
|
||||
|
||||
### Resuming Sessions
|
||||
|
||||
When you exit a CLI session, a resume command is printed:
|
||||
|
||||
```
|
||||
Resume this session with:
|
||||
hermes --resume 20260225_143052_a1b2c3
|
||||
|
||||
Session: 20260225_143052_a1b2c3
|
||||
Duration: 12m 34s
|
||||
Messages: 28 (5 user, 18 tool calls)
|
||||
```
|
||||
|
||||
Resume options:
|
||||
|
||||
```bash
|
||||
hermes --continue # Resume the most recent CLI session
|
||||
hermes -c # Short form
|
||||
hermes -c "my project" # Resume a named session (latest in lineage)
|
||||
hermes --resume 20260225_143052_a1b2c3 # Resume a specific session by ID
|
||||
hermes --resume "refactoring auth" # Resume by title
|
||||
hermes -r 20260225_143052_a1b2c3 # Short form
|
||||
```
|
||||
|
||||
Resuming restores the full conversation history from SQLite. The agent sees all previous messages, tool calls, and responses — just as if you never left.
|
||||
|
||||
Use `/title My Session Name` inside a chat to name the current session, or `hermes sessions rename <id> <title>` from the command line. Use `hermes sessions list` to browse past sessions.
|
||||
|
||||
### Session Storage
|
||||
|
||||
CLI sessions are stored in Hermes's SQLite state database under `~/.hermes/state.db`. The database keeps:
|
||||
|
||||
- session metadata (ID, title, timestamps, token counters)
|
||||
- message history
|
||||
- lineage across compressed/resumed sessions
|
||||
- full-text search indexes used by `session_search`
|
||||
|
||||
Some messaging adapters also keep per-platform transcript files alongside the database, but the CLI itself resumes from the SQLite session store.
|
||||
|
||||
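To make the storage model concrete, here is a deliberately simplified SQLite sketch. The table and column names are invented for illustration and are not the real `state.db` schema; full-text indexes and token counters are omitted.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a toy session store (hypothetical schema, not Hermes's own)."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS sessions (
            id TEXT PRIMARY KEY,
            title TEXT,
            parent_id TEXT REFERENCES sessions(id),  -- lineage link
            started_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL REFERENCES sessions(id),
            role TEXT NOT NULL,
            content TEXT NOT NULL
        );
        """
    )
    return db


def load_history(db: sqlite3.Connection, session_id: str):
    """Return a session's messages in insertion order, as resume would need them."""
    rows = db.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [{"role": role, "content": content} for role, content in rows]
```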
### Context Compression
|
||||
|
||||
Long conversations are automatically summarized when approaching context limits:
|
||||
|
||||
```yaml
# In ~/.hermes/config.yaml
compression:
  enabled: true
  threshold: 0.50              # Compress at 50% of context limit by default
  summary_model: "google/gemini-3-flash-preview"  # Model used for summarization
```
|
||||
|
||||
When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
|
||||
|
||||
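The compression rule can be sketched in a few lines. Both function names are hypothetical, and `summarize` stands in for the call to the summary model; only the threshold and the "first 3, last 4" rule come from the text above.

```python
def needs_compression(used_tokens: int, context_limit: int, threshold: float = 0.50) -> bool:
    """True once usage reaches the configured share of the context limit."""
    return used_tokens >= threshold * context_limit


def compress_turns(turns, summarize, keep_first: int = 3, keep_last: int = 4):
    """Replace the middle turns with one summary, preserving both ends."""
    if len(turns) <= keep_first + keep_last:
        return list(turns)  # nothing in the middle to summarize
    middle = turns[keep_first:-keep_last]
    return turns[:keep_first] + [summarize(middle)] + turns[-keep_last:]
```

For a ten-turn conversation this keeps turns 1 to 3 and 7 to 10 verbatim and folds turns 4 to 6 into a single summary entry.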
## Background Sessions
|
||||
|
||||
Run a prompt in a separate background session while continuing to use the CLI for other work:
|
||||
|
||||
```
|
||||
/background Analyze the logs in /var/log and summarize any errors from today
|
||||
```
|
||||
|
||||
Hermes immediately confirms the task and gives you back the prompt:
|
||||
|
||||
```
|
||||
🔄 Background task #1 started: "Analyze the logs in /var/log and summarize..."
|
||||
Task ID: bg_143022_a1b2c3
|
||||
```
|
||||
|
||||
### How It Works
|
||||
|
||||
Each `/background` prompt spawns a **completely separate agent session** in a daemon thread:
|
||||
|
||||
- **Isolated conversation** — the background agent has no knowledge of your current session's history. It receives only the prompt you provide.
|
||||
- **Same configuration** — the background agent inherits your model, provider, toolsets, reasoning settings, and fallback model from the current session.
|
||||
- **Non-blocking** — your foreground session stays fully interactive. You can chat, run commands, or even start more background tasks.
|
||||
- **Multiple tasks** — you can run several background tasks simultaneously. Each gets a numbered ID.
|
||||
|
||||
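The daemon-thread pattern described above looks roughly like this. It is a sketch only: `start_background`, `run_agent`, and `on_result` are hypothetical names, with `run_agent` standing in for a fresh agent session that receives nothing but the prompt.

```python
import itertools
import threading

_task_numbers = itertools.count(1)  # each task gets the next numbered ID


def start_background(prompt, run_agent, on_result):
    """Run `run_agent(prompt)` in a daemon thread and report back when done."""
    number = next(_task_numbers)

    def worker():
        try:
            result, error = run_agent(prompt), None
        except Exception as exc:  # failures become an error notification
            result, error = None, exc
        on_result(number, result, error)

    thread = threading.Thread(target=worker, name=f"background-{number}", daemon=True)
    thread.start()
    return number, thread
```

Because the thread is a daemon, it never keeps the process alive on exit, and the caller gets control back as soon as `start()` returns.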
### Results
|
||||
|
||||
When a background task finishes, the result appears as a panel in your terminal:
|
||||
|
||||
```
|
||||
╭─ ⚕ Hermes (background #1) ──────────────────────────────────╮
|
||||
│ Found 3 errors in syslog from today: │
|
||||
│ 1. OOM killer invoked at 03:22 — killed process nginx │
|
||||
│ 2. Disk I/O error on /dev/sda1 at 07:15 │
|
||||
│ 3. Failed SSH login attempts from 192.168.1.50 at 14:30 │
|
||||
╰──────────────────────────────────────────────────────────────╯
|
||||
```
|
||||
|
||||
If the task fails, you'll see an error notification instead. If `display.bell_on_complete` is enabled in your config, the terminal bell rings when the task finishes.
|
||||
|
||||
### Use Cases
|
||||
|
||||
- **Long-running research** — "/background research the latest developments in quantum error correction" while you work on code
|
||||
- **File processing** — "/background analyze all Python files in this repo and list any security issues" while you continue a conversation
|
||||
- **Parallel investigations** — start multiple background tasks to explore different angles simultaneously
|
||||
|
||||
:::info
|
||||
Background sessions do not appear in your main conversation history. They are standalone sessions with their own task ID (e.g., `bg_143022_a1b2c3`).
|
||||
:::
|
||||
|
||||
## Quiet Mode
|
||||
|
||||
By default, the CLI runs in quiet mode, which:
|
||||
- Suppresses verbose logging from tools
|
||||
- Enables kawaii-style animated feedback
|
||||
- Keeps output clean and user-friendly
|
||||
|
||||
For debug output:
|
||||
```bash
|
||||
hermes chat --verbose
|
||||
```
|
||||
1544
hermes_code/website/docs/user-guide/configuration.md
Normal file
File diff suppressed because it is too large
Load diff
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "Features",
|
||||
"position": 4,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Explore the powerful features of Hermes Agent."
|
||||
}
|
||||
}
|
||||
197
hermes_code/website/docs/user-guide/features/acp.md
Normal file
|
|
@ -0,0 +1,197 @@
|
|||
---
|
||||
sidebar_position: 11
|
||||
title: "ACP Editor Integration"
|
||||
description: "Use Hermes Agent inside ACP-compatible editors such as VS Code, Zed, and JetBrains"
|
||||
---
|
||||
|
||||
# ACP Editor Integration
|
||||
|
||||
Hermes Agent can run as an ACP server, letting ACP-compatible editors talk to Hermes over stdio and render:
|
||||
|
||||
- chat messages
|
||||
- tool activity
|
||||
- file diffs
|
||||
- terminal commands
|
||||
- approval prompts
|
||||
- streamed thinking / response chunks
|
||||
|
||||
ACP is a good fit when you want Hermes to behave like an editor-native coding agent instead of a standalone CLI or messaging bot.
|
||||
|
||||
## What Hermes exposes in ACP mode
|
||||
|
||||
Hermes runs with a curated `hermes-acp` toolset designed for editor workflows. It includes:
|
||||
|
||||
- file tools: `read_file`, `write_file`, `patch`, `search_files`
|
||||
- terminal tools: `terminal`, `process`
|
||||
- web/browser tools
|
||||
- memory, todo, session search
|
||||
- skills
|
||||
- `execute_code` and `delegate_task`
|
||||
- vision
|
||||
|
||||
It intentionally excludes things that do not fit typical editor UX, such as messaging delivery and cronjob management.
|
||||
|
||||
## Installation
|
||||
|
||||
Install Hermes normally, then add the ACP extra:
|
||||
|
||||
```bash
|
||||
pip install -e '.[acp]'
|
||||
```
|
||||
|
||||
This installs the `agent-client-protocol` dependency and enables:
|
||||
|
||||
- `hermes acp`
|
||||
- `hermes-acp`
|
||||
- `python -m acp_adapter`
|
||||
|
||||
## Launching the ACP server
|
||||
|
||||
Any of the following starts Hermes in ACP mode:
|
||||
|
||||
```bash
|
||||
hermes acp
|
||||
```
|
||||
|
||||
```bash
|
||||
hermes-acp
|
||||
```
|
||||
|
||||
```bash
|
||||
python -m acp_adapter
|
||||
```
|
||||
|
||||
Hermes logs to stderr so stdout remains reserved for ACP JSON-RPC traffic.
|
||||
|
||||
## Editor setup
|
||||
|
||||
### VS Code
|
||||
|
||||
Install an ACP client extension, then point it at the repo's `acp_registry/` directory.
|
||||
|
||||
Example settings snippet:
|
||||
|
||||
```json
{
  "acpClient.agents": [
    {
      "name": "hermes-agent",
      "registryDir": "/path/to/hermes-agent/acp_registry"
    }
  ]
}
```
|
||||
|
||||
### Zed
|
||||
|
||||
Example settings snippet:
|
||||
|
||||
```json
{
  "acp": {
    "agents": [
      {
        "name": "hermes-agent",
        "registry_dir": "/path/to/hermes-agent/acp_registry"
      }
    ]
  }
}
```
|
||||
|
||||
### JetBrains
|
||||
|
||||
Use an ACP-compatible plugin and point it at:
|
||||
|
||||
```text
|
||||
/path/to/hermes-agent/acp_registry
|
||||
```
|
||||
|
||||
## Registry manifest
|
||||
|
||||
The ACP registry manifest lives at:
|
||||
|
||||
```text
|
||||
acp_registry/agent.json
|
||||
```
|
||||
|
||||
It advertises a command-based agent whose launch command is:
|
||||
|
||||
```text
|
||||
hermes acp
|
||||
```
|
||||
|
||||
## Configuration and credentials
|
||||
|
||||
ACP mode uses the same Hermes configuration as the CLI:
|
||||
|
||||
- `~/.hermes/.env`
|
||||
- `~/.hermes/config.yaml`
|
||||
- `~/.hermes/skills/`
|
||||
- `~/.hermes/state.db`
|
||||
|
||||
Provider resolution uses Hermes' normal runtime resolver, so ACP inherits the currently configured provider and credentials.
|
||||
|
||||
## Session behavior
|
||||
|
||||
ACP sessions are tracked by the ACP adapter's in-memory session manager while the server is running.
|
||||
|
||||
Each session stores:
|
||||
|
||||
- session ID
|
||||
- working directory
|
||||
- selected model
|
||||
- current conversation history
|
||||
- cancel event
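As a rough illustration of the per-session state listed above, here is a minimal sketch of an in-memory session record and registry. The names `AcpSession` and `SessionManager` and their methods are hypothetical and are not taken from the Hermes source; the sketch only shows why such state disappears when the server process exits.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class AcpSession:
    """Hypothetical per-session record mirroring the fields listed above."""
    session_id: str
    cwd: str
    model: str
    history: list = field(default_factory=list)
    cancel_event: threading.Event = field(default_factory=threading.Event)


class SessionManager:
    """In-memory registry: contents are lost when the process exits."""

    def __init__(self):
        self._sessions = {}

    def create(self, session_id, cwd, model):
        session = AcpSession(session_id, cwd, model)
        self._sessions[session_id] = session
        return session

    def get(self, session_id):
        return self._sessions.get(session_id)

    def fork(self, source_id, new_id):
        # Copy the history so the fork can diverge from its source.
        source = self._sessions[source_id]
        forked = AcpSession(new_id, source.cwd, source.model, list(source.history))
        self._sessions[new_id] = forked
        return forked
```

Each session gets its own cancel event, so cancelling one session does not affect a fork of it.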
|
||||
|
||||
The underlying `AIAgent` still uses Hermes' normal persistence/logging paths, but ACP `list/load/resume/fork` are scoped to the currently running ACP server process.
|
||||
|
||||
## Working directory behavior
|
||||
|
||||
ACP sessions bind the editor's cwd to the Hermes task ID so file and terminal tools run relative to the editor workspace, not the server process cwd.
|
||||
|
||||
## Approvals
|
||||
|
||||
Dangerous terminal commands can be routed back to the editor as approval prompts. ACP approval options are simpler than the CLI flow:
|
||||
|
||||
- allow once
|
||||
- allow always
|
||||
- deny
|
||||
|
||||
On timeout or error, the approval bridge denies the request.
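The deny-by-default behavior can be pictured with the sketch below. It is not the actual approval bridge: `resolve_approval`, `ask_editor`, and the option strings are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical option names for the three choices listed above.
ALLOWED_ANSWERS = {"allow_once", "allow_always", "deny"}


def resolve_approval(ask_editor, timeout_s):
    """Return the editor's answer, or "deny" on timeout, error, or junk."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        answer = pool.submit(ask_editor).result(timeout=timeout_s)
    except Exception:
        # Covers both a timeout and any error raised while asking.
        return "deny"
    finally:
        pool.shutdown(wait=False)
    return answer if answer in ALLOWED_ANSWERS else "deny"
```

Failing closed means a hung or crashed editor can never result in a dangerous command being run unapproved.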
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### ACP agent does not appear in the editor
|
||||
|
||||
Check:
|
||||
|
||||
- the editor is pointed at the correct `acp_registry/` path
|
||||
- Hermes is installed and on your PATH
|
||||
- the ACP extra is installed (`pip install -e '.[acp]'`)
|
||||
|
||||
### ACP starts but immediately errors
|
||||
|
||||
Try these checks:
|
||||
|
||||
```bash
|
||||
hermes doctor
|
||||
hermes status
|
||||
hermes acp
|
||||
```
|
||||
|
||||
### Missing credentials
|
||||
|
||||
ACP mode does not have its own login flow. It uses Hermes' existing provider setup. Configure credentials with:
|
||||
|
||||
```bash
|
||||
hermes model
|
||||
```
|
||||
|
||||
or by editing `~/.hermes/.env`.
|
||||
|
||||
## See also
|
||||
|
||||
- [ACP Internals](../../developer-guide/acp-internals.md)
|
||||
- [Provider Runtime Resolution](../../developer-guide/provider-runtime.md)
|
||||
- [Tools Runtime](../../developer-guide/tools-runtime.md)
|
||||
236
hermes_code/website/docs/user-guide/features/api-server.md
Normal file
|
|
@ -0,0 +1,236 @@
|
|||
---
|
||||
sidebar_position: 14
|
||||
title: "API Server"
|
||||
description: "Expose hermes-agent as an OpenAI-compatible API for any frontend"
|
||||
---
|
||||
|
||||
# API Server
|
||||
|
||||
The API server exposes hermes-agent as an OpenAI-compatible HTTP endpoint. Any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, NextChat, ChatBox, and hundreds more — can connect to hermes-agent and use it as a backend.
|
||||
|
||||
Your agent handles requests with its full toolset (terminal, file operations, web search, memory, skills) and returns the final response. Tool calls execute invisibly server-side.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Enable the API server
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
API_SERVER_ENABLED=true
|
||||
API_SERVER_KEY=change-me-local-dev
|
||||
# Optional: only if a browser must call Hermes directly
|
||||
# API_SERVER_CORS_ORIGINS=http://localhost:3000
|
||||
```
|
||||
|
||||
### 2. Start the gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
You'll see:
|
||||
|
||||
```
|
||||
[API Server] API server listening on http://127.0.0.1:8642
|
||||
```
|
||||
|
||||
### 3. Connect a frontend
|
||||
|
||||
Point any OpenAI-compatible client at `http://localhost:8642/v1`:
|
||||
|
||||
```bash
|
||||
# Test with curl
|
||||
curl http://localhost:8642/v1/chat/completions \
|
||||
-H "Authorization: Bearer change-me-local-dev" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"model": "hermes-agent", "messages": [{"role": "user", "content": "Hello!"}]}'
|
||||
```
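
The same request can be made from Python. The sketch below uses only the standard library and reuses the placeholder URL and key from the steps above; `build_chat_request` and `send` are hypothetical helper names, not part of Hermes.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, messages, model="hermes-agent"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the gateway running, `send(build_chat_request("http://localhost:8642/v1", "change-me-local-dev", [{"role": "user", "content": "Hello!"}]))` should return a dictionary whose `choices[0]["message"]["content"]` holds the reply.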
|
||||
|
||||
Or connect Open WebUI, LobeChat, or any other frontend — see the [Open WebUI integration guide](/docs/user-guide/messaging/open-webui) for step-by-step instructions.
|
||||
|
||||
## Endpoints
|
||||
|
||||
### POST /v1/chat/completions
|
||||
|
||||
Standard OpenAI Chat Completions format. Stateless — the full conversation is included in each request via the `messages` array.
|
||||
|
||||
**Request:**
|
||||
```json
|
||||
{
|
||||
"model": "hermes-agent",
|
||||
"messages": [
|
||||
{"role": "system", "content": "You are a Python expert."},
|
||||
{"role": "user", "content": "Write a fibonacci function"}
|
||||
],
|
||||
"stream": false
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"id": "chatcmpl-abc123",
|
||||
"object": "chat.completion",
|
||||
"created": 1710000000,
|
||||
"model": "hermes-agent",
|
||||
"choices": [{
|
||||
"index": 0,
|
||||
"message": {"role": "assistant", "content": "Here's a fibonacci function..."},
|
||||
"finish_reason": "stop"
|
||||
}],
|
||||
"usage": {"prompt_tokens": 50, "completion_tokens": 200, "total_tokens": 250}
|
||||
}
|
||||
```
|
||||
|
||||
**Streaming** (`"stream": true`): Returns Server-Sent Events (SSE) with token-by-token response chunks. When streaming is enabled in config, tokens are emitted live as the LLM generates them. When disabled, the full response is sent as a single SSE chunk.
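
A client can reassemble the streamed text by reading the SSE `data:` lines. The sketch below assumes the chunks follow the usual OpenAI streaming shape (`choices[].delta.content`, terminated by `data: [DONE]`); that shape is an assumption here, so check it against the actual stream before relying on it.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

`"".join(iter_stream_text(lines))` gives the full reply whether the server sends many small chunks or a single one.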
|
||||
|
||||
### POST /v1/responses
|
||||
|
||||
OpenAI Responses API format. Supports server-side conversation state via `previous_response_id` — the server stores full conversation history (including tool calls and results) so multi-turn context is preserved without the client managing it.
|
||||
|
||||
**Request:**
|
||||
```json
|
||||
{
|
||||
"model": "hermes-agent",
|
||||
"input": "What files are in my project?",
|
||||
"instructions": "You are a helpful coding assistant.",
|
||||
"store": true
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"id": "resp_abc123",
|
||||
"object": "response",
|
||||
"status": "completed",
|
||||
"model": "hermes-agent",
|
||||
"output": [
|
||||
{"type": "function_call", "name": "terminal", "arguments": "{\"command\": \"ls\"}", "call_id": "call_1"},
|
||||
{"type": "function_call_output", "call_id": "call_1", "output": "README.md src/ tests/"},
|
||||
{"type": "message", "role": "assistant", "content": [{"type": "output_text", "text": "Your project has..."}]}
|
||||
],
|
||||
"usage": {"input_tokens": 50, "output_tokens": 200, "total_tokens": 250}
|
||||
}
|
||||
```
|
||||
|
||||
#### Multi-turn with previous_response_id
|
||||
|
||||
Chain responses to maintain full context (including tool calls) across turns:
|
||||
|
||||
```json
|
||||
{
|
||||
"input": "Now show me the README",
|
||||
"previous_response_id": "resp_abc123"
|
||||
}
|
||||
```
|
||||
|
||||
The server reconstructs the full conversation from the stored response chain — all previous tool calls and results are preserved.
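
On the client side, chaining only requires carrying the previous response's `id` forward. A minimal sketch, where `next_turn_payload` is a hypothetical helper and the field names are the ones shown in the examples above:

```python
def next_turn_payload(text, previous_response=None, model="hermes-agent"):
    """Build a /v1/responses request body, chained to an earlier response."""
    payload = {"model": model, "input": text, "store": True}
    if previous_response is not None:
        # Only the ID is sent; the server holds the stored history.
        payload["previous_response_id"] = previous_response["id"]
    return payload
```

The client never resends earlier turns or tool output, which keeps each request small.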
|
||||
|
||||
#### Named conversations
|
||||
|
||||
Use the `conversation` parameter instead of tracking response IDs:
|
||||
|
||||
```json
|
||||
{"input": "Hello", "conversation": "my-project"}
|
||||
{"input": "What's in src/?", "conversation": "my-project"}
|
||||
{"input": "Run the tests", "conversation": "my-project"}
|
||||
```
|
||||
|
||||
The server automatically chains to the latest response in that conversation. This works like the `/title` command for gateway sessions.
|
||||
|
||||
### GET /v1/responses/\{id\}
|
||||
|
||||
Retrieve a previously stored response by ID.
|
||||
|
||||
### DELETE /v1/responses/\{id\}
|
||||
|
||||
Delete a stored response.
|
||||
|
||||
### GET /v1/models
|
||||
|
||||
Lists `hermes-agent` as an available model. Required by most frontends for model discovery.
|
||||
|
||||
### GET /health
|
||||
|
||||
Health check. Returns `{"status": "ok"}`.
|
||||
|
||||
## System Prompt Handling
|
||||
|
||||
When a frontend sends a `system` message (Chat Completions) or `instructions` field (Responses API), hermes-agent **layers it on top** of its core system prompt. Your agent keeps all its tools, memory, and skills — the frontend's system prompt adds extra instructions.
|
||||
|
||||
This means you can customize behavior per-frontend without losing capabilities:
|
||||
- Open WebUI system prompt: "You are a Python expert. Always include type hints."
|
||||
- The agent still has terminal, file tools, web search, memory, etc.
|
||||
|
||||
## Authentication
|
||||
|
||||
Bearer token auth via the `Authorization` header:
|
||||
|
||||
```
|
||||
Authorization: Bearer ***
|
||||
```
|
||||
|
||||
Configure the key via `API_SERVER_KEY` env var. If you need a browser to call Hermes directly, also set `API_SERVER_CORS_ORIGINS` to an explicit allowlist.
|
||||
|
||||
:::warning Security
|
||||
The API server gives full access to hermes-agent's toolset, **including terminal commands**. If you change the bind address to `0.0.0.0` (network-accessible), **always set `API_SERVER_KEY`** and keep `API_SERVER_CORS_ORIGINS` narrow — without that, remote callers may be able to execute arbitrary commands on your machine.
|
||||
|
||||
The default bind address (`127.0.0.1`) is for local-only use. Browser access is disabled by default; enable it only for explicit trusted origins.
|
||||
:::
|
||||
|
||||
## Configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `API_SERVER_ENABLED` | `false` | Enable the API server |
|
||||
| `API_SERVER_PORT` | `8642` | HTTP server port |
|
||||
| `API_SERVER_HOST` | `127.0.0.1` | Bind address (localhost only by default) |
|
||||
| `API_SERVER_KEY` | _(none)_ | Bearer token for auth |
|
||||
| `API_SERVER_CORS_ORIGINS` | _(none)_ | Comma-separated allowed browser origins |
|
||||
|
||||
### config.yaml
|
||||
|
||||
```yaml
|
||||
# Not yet supported — use environment variables.
|
||||
# config.yaml support coming in a future release.
|
||||
```
|
||||
|
||||
## CORS
|
||||
|
||||
The API server does **not** enable browser CORS by default.
|
||||
|
||||
For direct browser access, set an explicit allowlist:
|
||||
|
||||
```bash
|
||||
API_SERVER_CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
|
||||
```
|
||||
|
||||
Most documented frontends such as Open WebUI connect server-to-server and do not need CORS at all.
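
To make the allowlist semantics concrete, here is a sketch of how a comma-separated value could be parsed and checked. It illustrates the idea only; `parse_origins` and `is_origin_allowed` are hypothetical names, and exact-match comparison is an assumption.

```python
def parse_origins(value):
    """Split a comma-separated origin list, ignoring blanks and whitespace."""
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def is_origin_allowed(origin, allowed):
    """Exact-match check; an empty allowlist rejects every browser origin."""
    return origin in allowed
```

An unset or empty variable yields an empty list, which matches the default of browser access being disabled.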
|
||||
|
||||
## Compatible Frontends
|
||||
|
||||
Any frontend that supports the OpenAI API format works. Tested/documented integrations:
|
||||
|
||||
| Frontend | Stars | Connection |
|
||||
|----------|-------|------------|
|
||||
| [Open WebUI](/docs/user-guide/messaging/open-webui) | 126k | Full guide available |
|
||||
| LobeChat | 73k | Custom provider endpoint |
|
||||
| LibreChat | 34k | Custom endpoint in librechat.yaml |
|
||||
| AnythingLLM | 56k | Generic OpenAI provider |
|
||||
| NextChat | 87k | BASE_URL env var |
|
||||
| ChatBox | 39k | API Host setting |
|
||||
| Jan | 26k | Remote model config |
|
||||
| HF Chat-UI | 8k | OPENAI_BASE_URL |
|
||||
| big-AGI | 7k | Custom endpoint |
|
||||
| OpenAI Python SDK | — | `OpenAI(base_url="http://localhost:8642/v1")` |
|
||||
| curl | — | Direct HTTP requests |
|
||||
|
||||
## Limitations
|
||||
|
||||
- **Response storage** — stored responses (for `previous_response_id`) are persisted in SQLite and survive gateway restarts. Max 100 stored responses (LRU eviction).
|
||||
- **No file upload** — vision/document analysis via uploaded files is not yet supported through the API.
|
||||
- **Model field is cosmetic** — the `model` field in requests is accepted but the actual LLM model used is configured server-side in config.yaml.
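
The LRU eviction mentioned above means that once the cap is reached, the least recently used response is dropped first. The sketch below shows that policy with an in-memory `OrderedDict`; it is not the SQLite-backed store Hermes uses, and `ResponseStore` is a hypothetical name.

```python
from collections import OrderedDict


class ResponseStore:
    """Tiny in-memory LRU store illustrating the eviction policy."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self._items = OrderedDict()

    def put(self, response_id, response):
        self._items[response_id] = response
        self._items.move_to_end(response_id)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict least recently used

    def get(self, response_id):
        if response_id not in self._items:
            return None
        self._items.move_to_end(response_id)  # a read counts as a use
        return self._items[response_id]
```

In practice this means a long-idle `previous_response_id` may no longer resolve once many newer responses have been stored.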
|
||||
226
hermes_code/website/docs/user-guide/features/batch-processing.md
Normal file
|
|
@ -0,0 +1,226 @@
|
|||
---
|
||||
sidebar_position: 12
|
||||
title: "Batch Processing"
|
||||
description: "Generate agent trajectories at scale — parallel processing, checkpointing, and toolset distributions"
|
||||
---
|
||||
|
||||
# Batch Processing
|
||||
|
||||
Batch processing lets you run the Hermes agent across hundreds or thousands of prompts in parallel, generating structured trajectory data. This is primarily used for **training data generation** — producing ShareGPT-format trajectories with tool usage statistics that can be used for fine-tuning or evaluation.
|
||||
|
||||
## Overview
|
||||
|
||||
The batch runner (`batch_runner.py`) processes a JSONL dataset of prompts, running each through a full agent session with tool access. Each prompt gets its own isolated environment. The output is structured trajectory data with full conversation history, tool call statistics, and reasoning coverage metrics.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Basic batch run
|
||||
python batch_runner.py \
|
||||
--dataset_file=data/prompts.jsonl \
|
||||
--batch_size=10 \
|
||||
--run_name=my_first_run \
|
||||
--model=anthropic/claude-sonnet-4-20250514 \
|
||||
--num_workers=4
|
||||
|
||||
# Resume an interrupted run
|
||||
python batch_runner.py \
|
||||
--dataset_file=data/prompts.jsonl \
|
||||
--batch_size=10 \
|
||||
--run_name=my_first_run \
|
||||
--resume
|
||||
|
||||
# List available toolset distributions
|
||||
python batch_runner.py --list_distributions
|
||||
```
|
||||
|
||||
## Dataset Format
|
||||
|
||||
The input dataset is a JSONL file (one JSON object per line). Each entry must have a `prompt` field:
|
||||
|
||||
```jsonl
|
||||
{"prompt": "Write a Python function that finds the longest palindromic substring"}
|
||||
{"prompt": "Create a REST API endpoint for user authentication using Flask"}
|
||||
{"prompt": "Debug this error: TypeError: cannot unpack non-iterable NoneType object"}
|
||||
```
|
||||
|
||||
Entries can optionally include:
|
||||
- `image` or `docker_image`: A container image to use for this prompt's sandbox (works with Docker, Modal, and Singularity backends)
|
||||
- `cwd`: Working directory override for the task's terminal session
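
If you generate datasets programmatically, a small helper can enforce the required `prompt` field before anything is written. This is a sketch; `to_jsonl` is a hypothetical helper, not part of `batch_runner.py`.

```python
import json


def to_jsonl(entries):
    """Serialize entries to JSONL text, requiring a non-empty 'prompt'."""
    lines = []
    for index, entry in enumerate(entries):
        prompt = entry.get("prompt")
        if not isinstance(prompt, str) or not prompt.strip():
            raise ValueError(f"entry {index} is missing a non-empty 'prompt' field")
        lines.append(json.dumps(entry, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)
```

Write the returned text to a file such as `data/prompts.jsonl` and pass that path as `--dataset_file`.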
|
||||
|
||||
## Configuration Options
|
||||
|
||||
| Parameter | Default | Description |
|
||||
|-----------|---------|-------------|
|
||||
| `--dataset_file` | (required) | Path to JSONL dataset |
|
||||
| `--batch_size` | (required) | Prompts per batch |
|
||||
| `--run_name` | (required) | Name for this run (used for output dir and checkpointing) |
|
||||
| `--distribution` | `"default"` | Toolset distribution to sample from |
|
||||
| `--model` | `claude-sonnet-4-20250514` | Model to use |
|
||||
| `--base_url` | `https://openrouter.ai/api/v1` | API base URL |
|
||||
| `--api_key` | (env var) | API key for model |
|
||||
| `--max_turns` | `10` | Maximum tool-calling iterations per prompt |
|
||||
| `--num_workers` | `4` | Parallel worker processes |
|
||||
| `--resume` | `false` | Resume from checkpoint |
|
||||
| `--verbose` | `false` | Enable verbose logging |
|
||||
| `--max_samples` | all | Only process first N samples from dataset |
|
||||
| `--max_tokens` | model default | Maximum tokens per model response |
|
||||
|
||||
### Provider Routing (OpenRouter)
|
||||
|
||||
| Parameter | Description |
|
||||
|-----------|-------------|
|
||||
| `--providers_allowed` | Comma-separated providers to allow (e.g., `"anthropic,openai"`) |
|
||||
| `--providers_ignored` | Comma-separated providers to ignore (e.g., `"together,deepinfra"`) |
|
||||
| `--providers_order` | Comma-separated preferred provider order |
|
||||
| `--provider_sort` | Sort by `"price"`, `"throughput"`, or `"latency"` |
|
||||
|
||||
### Reasoning Control
|
||||
|
||||
| Parameter | Description |
|
||||
|-----------|-------------|
|
||||
| `--reasoning_effort` | Effort level: `xhigh`, `high`, `medium`, `low`, `minimal`, `none` |
|
||||
| `--reasoning_disabled` | Completely disable reasoning/thinking tokens |
|
||||
|
||||
### Advanced Options
|
||||
|
||||
| Parameter | Description |
|
||||
|-----------|-------------|
|
||||
| `--ephemeral_system_prompt` | System prompt used during execution but NOT saved to trajectories |
|
||||
| `--log_prefix_chars` | Characters to show in log previews (default: 100) |
|
||||
| `--prefill_messages_file` | Path to JSON file with prefill messages for few-shot priming |
|
||||
|
||||
## Toolset Distributions
|
||||
|
||||
Each prompt gets a randomly sampled set of toolsets from a **distribution**. This ensures training data covers diverse tool combinations. Use `--list_distributions` to see all available distributions.
|
||||
|
||||
In the current implementation, distributions assign a probability to **each individual toolset**. The sampler flips each toolset independently, then guarantees that at least one toolset is enabled. This is different from a hand-authored table of prebuilt combinations.
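
The sampling scheme described above can be sketched as follows. The function name, the distribution values, and in particular the fallback rule (enable the most likely toolset when every flip fails) are assumptions for illustration; the real runner may guarantee a non-empty result differently.

```python
import random


def sample_toolsets(distribution, rng=None):
    """Flip each toolset independently, then guarantee a non-empty result."""
    if rng is None:
        rng = random.Random()
    chosen = [name for name, prob in distribution.items() if rng.random() < prob]
    if not chosen:
        # Assumed fallback: enable the single most likely toolset.
        chosen = [max(distribution, key=distribution.get)]
    return chosen
```

Because each toolset is an independent flip, the number of enabled toolsets varies from prompt to prompt instead of being drawn from a fixed list of combinations.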
|
||||
|
||||
## Output Format
|
||||
|
||||
All output goes to `data/<run_name>/`:
|
||||
|
||||
```text
|
||||
data/my_run/
|
||||
├── trajectories.jsonl # Combined final output (all batches merged)
|
||||
├── batch_0.jsonl # Individual batch results
|
||||
├── batch_1.jsonl
|
||||
├── ...
|
||||
├── checkpoint.json # Resume checkpoint
|
||||
└── statistics.json # Aggregate tool usage stats
|
||||
```
|
||||
|
||||
### Trajectory Format
|
||||
|
||||
Each line in `trajectories.jsonl` is a JSON object:
|
||||
|
||||
```json
{
  "prompt_index": 42,
  "conversations": [
    {"from": "human", "value": "Write a function..."},
    {"from": "gpt", "value": "I'll create that function...",
     "tool_calls": [...]},
    {"from": "tool", "value": "..."},
    {"from": "gpt", "value": "Here's the completed function..."}
  ],
  "metadata": {
    "batch_num": 2,
    "timestamp": "2026-01-15T10:30:00",
    "model": "anthropic/claude-sonnet-4-20250514"
  },
  "completed": true,
  "partial": false,
  "api_calls": 3,
  "toolsets_used": ["terminal", "file"],
  "tool_stats": {
    "terminal": {"count": 2, "success": 2, "failure": 0},
    "read_file": {"count": 1, "success": 1, "failure": 0}
  },
  "tool_error_counts": {
    "terminal": 0,
    "read_file": 0
  }
}
```
|
||||
|
||||
The `conversations` field uses a ShareGPT-like format with `from` and `value` fields. Tool stats are normalized to include all possible tools with zero defaults, ensuring consistent schema across entries for HuggingFace datasets compatibility.
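
The normalization step can be pictured like this. It is a sketch under the assumption that every tool entry has the three counters shown in the example; `normalize_tool_stats` is a hypothetical name.

```python
def normalize_tool_stats(tool_stats, all_tools):
    """Return stats for every known tool, filling unused ones with zeros."""
    zero = {"count": 0, "success": 0, "failure": 0}
    return {
        name: {**zero, **tool_stats.get(name, {})}
        for name in sorted(all_tools)
    }
```

With identical keys in every record, the dataset loads with one fixed schema instead of a different set of columns per row.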
|
||||
|
||||
## Checkpointing
|
||||
|
||||
The batch runner has robust checkpointing for fault tolerance:
|
||||
|
||||
- **Checkpoint file:** Saved after each batch completes, tracking which prompt indices are done
|
||||
- **Content-based resume:** On `--resume`, the runner scans existing batch files and matches completed prompts by their actual text content (not just indices), enabling recovery even if the dataset order changes
|
||||
- **Failed prompts:** Only successfully completed prompts are marked as done — failed prompts will be retried on resume
|
||||
- **Batch merging:** On completion, all batch files (including from previous runs) are merged into a single `trajectories.jsonl`
|
||||
|
||||
### How Resume Works
|
||||
|
||||
1. Scan all `batch_*.jsonl` files for completed prompts (by content matching)
|
||||
2. Filter the dataset to exclude already-completed prompts
|
||||
3. Re-batch the remaining prompts
|
||||
4. Process only the remaining prompts
|
||||
5. Merge all batch files (old + new) into final output
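
Steps 1 to 3 can be sketched as below. The sketch assumes that a prompt's text can be recovered from the first `human` turn of a saved trajectory and that `completed` marks success; both are assumptions based on the trajectory example, and the function names are hypothetical.

```python
import json


def completed_prompts(batch_lines):
    """Collect the prompt text of every successfully completed record."""
    done = set()
    for line in batch_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if not record.get("completed"):
            continue  # failed prompts stay eligible for retry
        for turn in record.get("conversations", []):
            if turn.get("from") == "human":
                done.add(turn["value"])
                break
    return done


def remaining(dataset, done):
    """Keep only dataset entries whose prompt text is not done yet."""
    return [entry for entry in dataset if entry["prompt"] not in done]


def rebatch(entries, batch_size):
    """Split the remaining entries into fresh batches."""
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]
```

Matching on text instead of index is what lets a resume succeed after the dataset has been reordered.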
|
||||
|
||||
## Quality Filtering
|
||||
|
||||
The batch runner applies automatic quality filtering:
|
||||
|
||||
- **No-reasoning filter:** Samples where zero assistant turns contain reasoning (no `<REASONING_SCRATCHPAD>` or native thinking tokens) are discarded
|
||||
- **Corrupted entry filter:** Entries with hallucinated tool names (not in the valid tool list) are filtered out during the final merge
|
||||
- **Reasoning statistics:** Tracks percentage of turns with/without reasoning across the entire run
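
A simplified version of the no-reasoning check might look like the sketch below. It only looks for the `<REASONING_SCRATCHPAD>` tag in assistant turns; how native thinking tokens are recorded is not shown in this guide, so they are left out. The function names are hypothetical.

```python
MARKER = "<REASONING_SCRATCHPAD>"


def assistant_turns(trajectory):
    """Return the assistant ("gpt") turns of a ShareGPT-style trajectory."""
    return [t for t in trajectory.get("conversations", []) if t.get("from") == "gpt"]


def reasoning_coverage(trajectory):
    """Fraction of assistant turns that contain the scratchpad tag."""
    turns = assistant_turns(trajectory)
    if not turns:
        return 0.0
    with_reasoning = sum(1 for t in turns if MARKER in t.get("value", ""))
    return with_reasoning / len(turns)


def keep_sample(trajectory):
    """Discard samples where no assistant turn shows any reasoning."""
    return reasoning_coverage(trajectory) > 0.0
```

A sample is kept as soon as one assistant turn contains reasoning, so the filter removes only trajectories with none at all.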
|
||||
|
||||
## Statistics
|
||||
|
||||
After completion, the runner prints comprehensive statistics:
|
||||
|
||||
- **Tool usage:** Call counts, success/failure rates per tool
|
||||
- **Reasoning coverage:** Percentage of assistant turns with reasoning
|
||||
- **Samples discarded:** Count of samples filtered for lacking reasoning
|
||||
- **Duration:** Total processing time
|
||||
|
||||
Statistics are also saved to `statistics.json` for programmatic analysis.
|
||||
|
||||
## Use Cases
|
||||
|
||||
### Training Data Generation
|
||||
|
||||
Generate diverse tool-use trajectories for fine-tuning:
|
||||
|
||||
```bash
|
||||
python batch_runner.py \
|
||||
--dataset_file=data/coding_prompts.jsonl \
|
||||
--batch_size=20 \
|
||||
--run_name=coding_v1 \
|
||||
--model=anthropic/claude-sonnet-4-20250514 \
|
||||
--num_workers=8 \
|
||||
--distribution=default \
|
||||
--max_turns=15
|
||||
```
|
||||
|
||||
### Model Evaluation
|
||||
|
||||
Evaluate how well a model uses tools across standardized prompts:
|
||||
|
||||
```bash
|
||||
python batch_runner.py \
|
||||
--dataset_file=data/eval_suite.jsonl \
|
||||
--batch_size=10 \
|
||||
--run_name=eval_gpt4 \
|
||||
--model=openai/gpt-4o \
|
||||
--num_workers=4 \
|
||||
--max_turns=10
|
||||
```
|
||||
|
||||
### Per-Prompt Container Images
|
||||
|
||||
For benchmarks requiring specific environments, each prompt can specify its own container image:
|
||||
|
||||
```jsonl
|
||||
{"prompt": "Install numpy and compute eigenvalues of a 3x3 matrix", "image": "python:3.11-slim"}
|
||||
{"prompt": "Compile this Rust program and run it", "image": "rust:1.75"}
|
||||
{"prompt": "Set up a Node.js Express server", "image": "node:20-alpine", "cwd": "/app"}
|
||||
```
|
||||
|
||||
The batch runner verifies Docker images are accessible before running each prompt.
|
||||
281
hermes_code/website/docs/user-guide/features/browser.md
Normal file
|
|
@ -0,0 +1,281 @@
|
|||
---
|
||||
title: Browser Automation
|
||||
description: Control browsers with multiple providers, local Chrome via CDP, or cloud browsers for web interaction, form filling, scraping, and more.
|
||||
sidebar_label: Browser
|
||||
sidebar_position: 5
|
||||
---
|
||||
|
||||
# Browser Automation
|
||||
|
||||
Hermes Agent includes a full browser automation toolset with multiple backend options:
|
||||
|
||||
- **Browserbase cloud mode** via [Browserbase](https://browserbase.com) for managed cloud browsers and anti-bot tooling
|
||||
- **Browser Use cloud mode** via [Browser Use](https://browser-use.com) as an alternative cloud browser provider
|
||||
- **Local Chrome via CDP** — connect browser tools to your own Chrome instance using `/browser connect`
|
||||
- **Local browser mode** via the `agent-browser` CLI and a local Chromium installation
|
||||
|
||||
In all modes, the agent can navigate websites, interact with page elements, fill forms, and extract information.
|
||||
|
||||
## Overview
|
||||
|
||||
Pages are represented as **accessibility trees** (text-based snapshots), making them ideal for LLM agents. Interactive elements get ref IDs (like `@e1`, `@e2`) that the agent uses for clicking and typing.
|
||||
|
||||
Key capabilities:
|
||||
|
||||
- **Multi-provider cloud execution** — Browserbase or Browser Use, no local browser needed
|
||||
- **Local Chrome integration** — attach to your running Chrome via CDP for hands-on browsing
|
||||
- **Built-in stealth** — random fingerprints, CAPTCHA solving, residential proxies (Browserbase)
|
||||
- **Session isolation** — each task gets its own browser session
|
||||
- **Automatic cleanup** — inactive sessions are closed after a timeout
|
||||
- **Vision analysis** — screenshot + AI analysis for visual understanding
|
||||
|
||||
## Setup
|
||||
|
||||
### Browserbase cloud mode
|
||||
|
||||
To use Browserbase-managed cloud browsers, add:
|
||||
|
||||
```bash
|
||||
# Add to ~/.hermes/.env
|
||||
BROWSERBASE_API_KEY=***
|
||||
BROWSERBASE_PROJECT_ID=your-project-id-here
|
||||
```
|
||||
|
||||
Get your credentials at [browserbase.com](https://browserbase.com).
|
||||
|
||||
### Browser Use cloud mode
|
||||
|
||||
To use Browser Use as your cloud browser provider, add:
|
||||
|
||||
```bash
|
||||
# Add to ~/.hermes/.env
|
||||
BROWSER_USE_API_KEY=***
|
||||
```
|
||||
|
||||
Get your API key at [browser-use.com](https://browser-use.com). Browser Use provides a cloud browser via its REST API. If both Browserbase and Browser Use credentials are set, Browserbase takes priority.
|
||||
|
||||
### Local Chrome via CDP (`/browser connect`)
|
||||
|
||||
Instead of a cloud provider, you can attach Hermes browser tools to your own running Chrome instance via the Chrome DevTools Protocol (CDP). This is useful when you want to see what the agent is doing in real-time, interact with pages that require your own cookies/sessions, or avoid cloud browser costs.
|
||||
|
||||
In the CLI, use:
|
||||
|
||||
```
|
||||
/browser connect # Connect to Chrome at ws://localhost:9222
|
||||
/browser connect ws://host:port # Connect to a specific CDP endpoint
|
||||
/browser status # Check current connection
|
||||
/browser disconnect # Detach and return to cloud/local mode
|
||||
```
|
||||
|
||||
If Chrome isn't already running with remote debugging, Hermes will attempt to auto-launch it with `--remote-debugging-port=9222`.
|
||||
|
||||
:::tip
|
||||
To start Chrome manually with CDP enabled:
|
||||
```bash
|
||||
# Linux
|
||||
google-chrome --remote-debugging-port=9222
|
||||
|
||||
# macOS
|
||||
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
|
||||
```
|
||||
:::
|
||||
|
||||
When connected via CDP, all browser tools (`browser_navigate`, `browser_click`, etc.) operate on your live Chrome instance instead of spinning up a cloud session.
|
||||
|
||||
### Local browser mode
|
||||
|
||||
If you do **not** set any cloud credentials and don't use `/browser connect`, Hermes can still use the browser tools through a local Chromium install driven by `agent-browser`.
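
Putting the four modes together, backend selection can be summarized by the sketch below. It is not Hermes' actual resolver: `pick_browser_backend` and the returned labels are hypothetical, and requiring both Browserbase variables is an assumption based on the setup steps above.

```python
def pick_browser_backend(env, cdp_connected=False):
    """Choose a browser backend from the environment and connection state."""
    if cdp_connected:
        return "cdp"  # an active /browser connect session is used first
    if env.get("BROWSERBASE_API_KEY") and env.get("BROWSERBASE_PROJECT_ID"):
        return "browserbase"  # takes priority over Browser Use
    if env.get("BROWSER_USE_API_KEY"):
        return "browser-use"
    return "local"  # local Chromium driven by agent-browser
```

For example, passing `os.environ` on a machine with no cloud credentials and no CDP connection returns `"local"`.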
|
||||
|
||||
### Optional Environment Variables
|
||||
|
||||
```bash
|
||||
# Residential proxies for better CAPTCHA solving (default: "true")
|
||||
BROWSERBASE_PROXIES=true
|
||||
|
||||
# Advanced stealth with custom Chromium — requires Scale Plan (default: "false")
|
||||
BROWSERBASE_ADVANCED_STEALTH=false
|
||||
|
||||
# Session reconnection after disconnects — requires paid plan (default: "true")
|
||||
BROWSERBASE_KEEP_ALIVE=true
|
||||
|
||||
# Custom session timeout in milliseconds (default: project default)
|
||||
# Examples: 600000 (10min), 1800000 (30min)
|
||||
BROWSERBASE_SESSION_TIMEOUT=600000
|
||||
|
||||
# Inactivity timeout before auto-cleanup in seconds (default: 300)
|
||||
BROWSER_INACTIVITY_TIMEOUT=300
|
||||
```
|
||||
|
||||
### Install agent-browser CLI
|
||||
|
||||
```bash
|
||||
npm install -g agent-browser
|
||||
# Or install locally in the repo:
|
||||
npm install
|
||||
```
|
||||
|
||||
:::info
|
||||
The `browser` toolset must be included in your config's `toolsets` list or enabled via `hermes config set toolsets '["hermes-cli", "browser"]'`.
|
||||
:::
|
||||
|
||||
## Available Tools
|
||||
|
||||
### `browser_navigate`
|
||||
|
||||
Navigate to a URL. Must be called before any other browser tool. Initializes the browser session for the active backend (for example, a Browserbase session in Browserbase cloud mode).
|
||||
|
||||
```
|
||||
Navigate to https://github.com/NousResearch
|
||||
```
|
||||
|
||||
:::tip
|
||||
For simple information retrieval, prefer `web_search` or `web_extract` — they are faster and cheaper. Use browser tools when you need to **interact** with a page (click buttons, fill forms, handle dynamic content).
|
||||
:::
|
||||
|
||||
### `browser_snapshot`
|
||||
|
||||
Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs like `@e1`, `@e2` for use with `browser_click` and `browser_type`.
|
||||
|
||||
- **`full=false`** (default): Compact view showing only interactive elements
|
||||
- **`full=true`**: Complete page content
|
||||
|
||||
Snapshots over 8000 characters are automatically summarized by an LLM.
|
||||
|
||||
### `browser_click`

Click an element identified by its ref ID from the snapshot.

```
Click @e5 to press the "Sign In" button
```

### `browser_type`

Type text into an input field. Clears the field first, then types the new text.

```
Type "hermes agent" into the search field @e3
```

### `browser_scroll`

Scroll the page up or down to reveal more content.

```
Scroll down to see more results
```

### `browser_press`

Press a keyboard key. Useful for submitting forms or navigation.

```
Press Enter to submit the form
```

Supported keys: `Enter`, `Tab`, `Escape`, `ArrowDown`, `ArrowUp`, and more.

### `browser_back`

Navigate back to the previous page in browser history.

### `browser_get_images`

List all images on the current page with their URLs and alt text. Useful for finding images to analyze.

### `browser_vision`

Take a screenshot and analyze it with vision AI. Use this when text snapshots don't capture important visual information — especially useful for CAPTCHAs, complex layouts, or visual verification challenges.

The screenshot is saved persistently and the file path is returned alongside the AI analysis. On messaging platforms (Telegram, Discord, Slack, WhatsApp), you can ask the agent to share the screenshot — it will be sent as a native photo attachment via the `MEDIA:` mechanism.

```
What does the chart on this page show?
```

Screenshots are stored in `~/.hermes/browser_screenshots/` and automatically cleaned up after 24 hours.

### `browser_console`

Get browser console output (log/warn/error messages) and uncaught JavaScript exceptions from the current page. Essential for detecting silent JS errors that don't appear in the accessibility tree.

```
Check the browser console for any JavaScript errors
```

Use `clear=True` to clear the console after reading, so subsequent calls only show new messages.

### `browser_close`

Close the browser session and release resources. Call this when done to free up Browserbase session quota.

## Practical Examples

### Filling Out a Web Form

```
User: Sign up for an account on example.com with my email john@example.com

Agent workflow:
1. browser_navigate("https://example.com/signup")
2. browser_snapshot() → sees form fields with refs
3. browser_type(ref="@e3", text="john@example.com")
4. browser_type(ref="@e5", text="SecurePass123")
5. browser_click(ref="@e8") → clicks "Create Account"
6. browser_snapshot() → confirms success
7. browser_close()
```

### Researching Dynamic Content

```
User: What are the top trending repos on GitHub right now?

Agent workflow:
1. browser_navigate("https://github.com/trending")
2. browser_snapshot(full=true) → reads trending repo list
3. Returns formatted results
4. browser_close()
```

## Session Recording

Automatically record browser sessions as WebM video files:

```yaml
browser:
  record_sessions: true  # default: false
```

When enabled, recording starts automatically on the first `browser_navigate` and saves to `~/.hermes/browser_recordings/` when the session closes. Works in both local and cloud (Browserbase) modes. Recordings older than 72 hours are automatically cleaned up.

## Stealth Features

Browserbase provides automatic stealth capabilities:

| Feature | Default | Notes |
|---------|---------|-------|
| Basic Stealth | Always on | Random fingerprints, viewport randomization, CAPTCHA solving |
| Residential Proxies | On | Routes through residential IPs for better access |
| Advanced Stealth | Off | Custom Chromium build, requires Scale Plan |
| Keep Alive | On | Session reconnection after network hiccups |

:::note
If paid features aren't available on your plan, Hermes automatically falls back — first disabling `keepAlive`, then proxies — so browsing still works on free plans.
:::

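The fallback order in the note above can be pictured as a short list of progressively reduced session options. This is only a sketch: `fallback_attempts` and the option keys `keep_alive` and `proxies` are hypothetical names chosen for illustration, not Hermes or Browserbase identifiers.

```python
def fallback_attempts(options):
    """Yield session options to try, dropping paid features one at a time.

    Assumed order (from the note above): first without keep-alive,
    then also without proxies. The key names are illustrative.
    """
    attempt = dict(options)
    yield dict(attempt)
    for paid_feature in ("keep_alive", "proxies"):
        if attempt.get(paid_feature):
            attempt[paid_feature] = False
            yield dict(attempt)


attempts = list(fallback_attempts({"keep_alive": True, "proxies": True}))
for attempt in attempts:
    print(attempt)
```

A caller would try each entry in turn and stop at the first one the plan accepts.
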
## Session Management

- Each task gets an isolated browser session via Browserbase
- Sessions are automatically cleaned up after inactivity (default: 5 minutes)
- A background thread checks every 30 seconds for stale sessions
- Emergency cleanup runs on process exit to prevent orphaned sessions
- Sessions are released via the Browserbase API (`REQUEST_RELEASE` status)

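The inactivity check in the list above amounts to comparing each session's last-activity timestamp with the timeout. A minimal sketch, assuming sessions are tracked in a plain dict keyed by task ID (`stale_sessions` and that dict layout are hypothetical, not the actual Hermes data structures):

```python
INACTIVITY_TIMEOUT = 300  # seconds; mirrors the BROWSER_INACTIVITY_TIMEOUT default


def stale_sessions(last_activity, now, timeout=INACTIVITY_TIMEOUT):
    """Return the task IDs whose sessions have been idle longer than `timeout`."""
    return sorted(task for task, seen in last_activity.items() if now - seen > timeout)


# Timestamps are seconds on an arbitrary clock, chosen to keep the example deterministic.
activity = {"task-a": 990.0, "task-b": 650.0, "task-c": 100.0}
print(stale_sessions(activity, now=1000.0))
```

A background thread would call a function like this on its 30-second tick and release whatever it returns.
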
## Limitations

- **Text-based interaction** — relies on accessibility tree, not pixel coordinates
- **Snapshot size** — large pages may be truncated or LLM-summarized at 8000 characters
- **Session timeout** — cloud sessions expire based on your provider's plan settings
- **Cost** — cloud sessions consume provider credits; use `browser_close` when done. Use `/browser connect` for free local browsing.
- **No file downloads** — cannot download files from the browser

30
hermes_code/website/docs/user-guide/features/checkpoints.md
Normal file
@ -0,0 +1,30 @@

# Filesystem Checkpoints

Hermes automatically snapshots your working directory before making file changes, giving you a safety net to roll back if something goes wrong. Checkpoints are **enabled by default**.

## Quick Reference

| Command | Description |
|---------|-------------|
| `/rollback` | List all checkpoints with change stats |
| `/rollback <N>` | Restore to checkpoint N (also undoes last chat turn) |
| `/rollback diff <N>` | Preview diff between checkpoint N and current state |
| `/rollback <N> <file>` | Restore a single file from checkpoint N |

## What Triggers Checkpoints

- **File tools** — `write_file` and `patch`
- **Destructive terminal commands** — `rm`, `mv`, `sed -i`, output redirects (`>`), `git reset`/`clean`

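To make the "destructive terminal command" trigger concrete, here is a rough sketch of how such commands could be recognised with regular expressions. The patterns and the `is_destructive` helper are assumptions written for this example; they over-approximate (a `>` inside a quoted string would also match) and do not reproduce Hermes's actual detection logic.

```python
import re

# Hypothetical patterns for the triggers listed above.
DESTRUCTIVE_PATTERNS = [
    r"\b(rm|mv)\s",               # rm / mv followed by arguments
    r"\bsed\b.*\s-i\b",           # sed with in-place editing
    r">(?!&)",                    # output redirect, but not fd duplication like 2>&1
    r"\bgit\s+(reset|clean)\b",   # git reset / git clean
]


def is_destructive(command):
    """Return True if the command matches any of the illustrative patterns."""
    return any(re.search(pattern, command) for pattern in DESTRUCTIVE_PATTERNS)


for cmd in ["rm -rf build", "echo hi > out.txt", "ls -la", "pytest -q 2>&1"]:
    print(cmd, "->", is_destructive(cmd))
```
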
## Configuration

```yaml
# ~/.hermes/config.yaml
checkpoints:
  enabled: true          # default: true
  max_snapshots: 50      # max checkpoints per directory
```

## Learn More

For the full guide — how shadow repos work, diff previews, file-level restore, conversation undo, safety guards, and best practices — see **[Checkpoints and /rollback](../checkpoints-and-rollback.md)**.

210
hermes_code/website/docs/user-guide/features/code-execution.md
Normal file
@ -0,0 +1,210 @@

---
sidebar_position: 8
title: "Code Execution"
description: "Sandboxed Python execution with RPC tool access — collapse multi-step workflows into a single turn"
---

# Code Execution (Programmatic Tool Calling)

The `execute_code` tool lets the agent write Python scripts that call Hermes tools programmatically, collapsing multi-step workflows into a single LLM turn. The script runs in a sandboxed child process on the agent host, communicating via Unix domain socket RPC.

## How It Works

1. The agent writes a Python script using `from hermes_tools import ...`
2. Hermes generates a `hermes_tools.py` stub module with RPC functions
3. Hermes opens a Unix domain socket and starts an RPC listener thread
4. The script runs in a child process — tool calls travel over the socket back to Hermes
5. Only the script's `print()` output is returned to the LLM; intermediate tool results never enter the context window

```python
# The agent can write scripts like:
from hermes_tools import web_search, web_extract

results = web_search("Python 3.13 features", limit=5)
for r in results["data"]["web"]:
    content = web_extract([r["url"]])
    # ... filter and process ...
print(summary)
```

**Available tools in sandbox:** `web_search`, `web_extract`, `read_file`, `write_file`, `search_files`, `patch`, `terminal` (foreground only).

## When the Agent Uses This

The agent uses `execute_code` when there are:

- **3+ tool calls** with processing logic between them
- Bulk data filtering or conditional branching
- Loops over results

The key benefit: intermediate tool results never enter the context window — only the final `print()` output comes back, dramatically reducing token usage.

## Practical Examples

### Data Processing Pipeline

```python
from hermes_tools import search_files, read_file
import json

# Find all config files and extract database settings
matches = search_files("database", path=".", file_glob="*.yaml", limit=20)
configs = []
for match in matches.get("matches", []):
    content = read_file(match["path"])
    configs.append({"file": match["path"], "preview": content["content"][:200]})

print(json.dumps(configs, indent=2))
```

### Multi-Step Web Research

```python
from hermes_tools import web_search, web_extract
import json

# Search, extract, and summarize in one turn
results = web_search("Rust async runtime comparison 2025", limit=5)
summaries = []
for r in results["data"]["web"]:
    page = web_extract([r["url"]])
    for p in page.get("results", []):
        if p.get("content"):
            summaries.append({
                "title": r["title"],
                "url": r["url"],
                "excerpt": p["content"][:500]
            })

print(json.dumps(summaries, indent=2))
```

### Bulk File Refactoring

```python
from hermes_tools import search_files, read_file, patch

# Find all Python files using deprecated API and fix them
matches = search_files("old_api_call", path="src/", file_glob="*.py")
fixed = 0
for match in matches.get("matches", []):
    result = patch(
        path=match["path"],
        old_string="old_api_call(",
        new_string="new_api_call(",
        replace_all=True
    )
    if "error" not in str(result):
        fixed += 1

print(f"Fixed {fixed} files out of {len(matches.get('matches', []))} matches")
```

### Build and Test Pipeline

```python
from hermes_tools import terminal, read_file
import json
import re

# Run tests, parse results, and report
result = terminal("cd /project && python -m pytest --tb=short -q 2>&1", timeout=120)
output = result.get("output", "")


def count(label):
    """Read a count such as '3 passed' from pytest's summary (0 if absent).

    The last match is used because the summary line comes at the end.
    """
    matches = re.findall(rf"(\d+) {label}", output)
    return int(matches[-1]) if matches else 0


report = {
    "passed": count("passed"),
    "failed": count("failed"),
    "errors": count("errors?"),
    "exit_code": result.get("exit_code", -1),
    "summary": output[-500:]  # last 500 characters at most
}

print(json.dumps(report, indent=2))
```

## Resource Limits

| Resource | Limit | Notes |
|----------|-------|-------|
| **Timeout** | 5 minutes (300s) | Script is killed with SIGTERM, then SIGKILL after 5s grace |
| **Stdout** | 50 KB | Output truncated with `[output truncated at 50KB]` notice |
| **Stderr** | 10 KB | Included in output on non-zero exit for debugging |
| **Tool calls** | 50 per execution | Error returned when limit reached |

All limits are configurable via `config.yaml`:

```yaml
# In ~/.hermes/config.yaml
code_execution:
  timeout: 300       # Max seconds per script (default: 300)
  max_tool_calls: 50 # Max tool calls per execution (default: 50)
```

## How Tool Calls Work Inside Scripts

When your script calls a function like `web_search("query")`:

1. The call is serialized to JSON and sent over a Unix domain socket to the parent process
2. The parent dispatches through the standard `handle_function_call` handler
3. The result is sent back over the socket
4. The function returns the parsed result

This means tool calls inside scripts behave identically to normal tool calls — same rate limits, same error handling, same capabilities. The only restriction is that `terminal()` is foreground-only (no `background`, `pty`, or `check_interval` parameters).

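The four steps above can be imitated in a few lines with the standard library. This is a sketch under stated assumptions, not the Hermes implementation: the newline-delimited JSON framing, the `{"tool": ..., "args": ...}` message shape, and the helpers `serve_one_call` and `call_tool` are invented for illustration, and `handle_function_call` here is a stand-in that returns canned data. A connected socket pair plays the role of the Unix domain socket between parent and child.

```python
import json
import socket
import threading


def handle_function_call(name, args):
    """Stand-in dispatcher; no real tool is called in this sketch."""
    if name == "web_search":
        return {"data": {"web": [{"url": "https://example.com", "title": args["query"]}]}}
    return {"error": f"unknown tool: {name}"}


def serve_one_call(conn):
    """Parent side: read one JSON request line, dispatch it, reply with JSON."""
    with conn, conn.makefile("rwb") as stream:
        request = json.loads(stream.readline())
        result = handle_function_call(request["tool"], request["args"])
        stream.write(json.dumps(result).encode() + b"\n")
        stream.flush()


def call_tool(conn, name, **args):
    """Child side: roughly what a generated stub function might do."""
    with conn.makefile("rwb") as stream:
        stream.write(json.dumps({"tool": name, "args": args}).encode() + b"\n")
        stream.flush()
        return json.loads(stream.readline())


parent_end, child_end = socket.socketpair()  # connected pair (Unix domain on Linux/macOS)
listener = threading.Thread(target=serve_one_call, args=(parent_end,))
listener.start()
result = call_tool(child_end, "web_search", query="hermes agent")
listener.join()
child_end.close()
print(result["data"]["web"][0]["title"])
```

Because the stub only forwards the call, everything the parent does for a normal tool call applies unchanged, which is the point the paragraph above makes.
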
## Error Handling

When a script fails, the agent receives structured error information:

- **Non-zero exit code**: stderr is included in the output so the agent sees the full traceback
- **Timeout**: Script is killed and the agent sees `"Script timed out after 300s and was killed."`
- **Interruption**: If the user sends a new message during execution, the script is terminated and the agent sees `[execution interrupted — user sent a new message]`
- **Tool call limit**: When the 50-call limit is hit, subsequent tool calls return an error message

The response always includes `status` (success/error/timeout/interrupted), `output`, `tool_calls_made`, and `duration_seconds`.

## Security

:::danger Security Model
The child process runs with a **minimal environment**. API keys, tokens, and credentials are stripped by default. The script accesses tools exclusively via the RPC channel — it cannot read secrets from environment variables unless explicitly allowed.
:::

Environment variables containing `KEY`, `TOKEN`, `SECRET`, `PASSWORD`, `CREDENTIAL`, `PASSWD`, or `AUTH` in their names are excluded. Only safe system variables (`PATH`, `HOME`, `LANG`, `SHELL`, `PYTHONPATH`, `VIRTUAL_ENV`, etc.) are passed through.

### Skill Environment Variable Passthrough

When a skill declares `required_environment_variables` in its frontmatter, those variables are **automatically passed through** to both `execute_code` and `terminal` sandboxes after the skill is loaded. This lets skills use their declared API keys without weakening the security posture for arbitrary code.

For non-skill use cases, you can explicitly allowlist variables in `config.yaml`:

```yaml
terminal:
  env_passthrough:
    - MY_CUSTOM_KEY
    - ANOTHER_TOKEN
```

See the [Security guide](/docs/user-guide/security#environment-variable-passthrough) for full details.

The script runs in a temporary directory that is cleaned up after execution. The child process runs in its own process group so it can be cleanly killed on timeout or interruption.

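The filtering rules in this section can be summarised as "safe system variables, plus anything explicitly passed through". The sketch below is an approximation for illustration: `child_environment` is a hypothetical helper, and `SAFE_VARIABLES` contains only the names spelled out above (the source list ends in "etc.", so the real set is larger).

```python
BLOCKED_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL", "PASSWD", "AUTH")
SAFE_VARIABLES = {"PATH", "HOME", "LANG", "SHELL", "PYTHONPATH", "VIRTUAL_ENV"}


def is_secret_name(name):
    """True if the variable name contains one of the blocked substrings."""
    return any(part in name.upper() for part in BLOCKED_SUBSTRINGS)


def child_environment(env, passthrough=()):
    """Keep safe system variables and anything explicitly passed through."""
    kept = {}
    for name, value in env.items():
        if name in passthrough or (name in SAFE_VARIABLES and not is_secret_name(name)):
            kept[name] = value
    return kept


env = {
    "PATH": "/usr/bin",
    "OPENAI_API_KEY": "sk-example",
    "SSH_AUTH_SOCK": "/tmp/agent.sock",
    "MY_CUSTOM_KEY": "value",
}
print(child_environment(env))
print(child_environment(env, passthrough=("MY_CUSTOM_KEY",)))
```

Note that `SSH_AUTH_SOCK` is dropped because its name contains `AUTH`, even though it is not an API key.
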
## execute_code vs terminal

| Use Case | execute_code | terminal |
|----------|-------------|----------|
| Multi-step workflows with tool calls between | ✅ | ❌ |
| Simple shell command | ❌ | ✅ |
| Filtering/processing large tool outputs | ✅ | ❌ |
| Running a build or test suite | ❌ | ✅ |
| Looping over search results | ✅ | ❌ |
| Interactive/background processes | ❌ | ✅ |
| Needs API keys in environment | ⚠️ Only via [passthrough](/docs/user-guide/security#environment-variable-passthrough) | ✅ (most pass through) |

**Rule of thumb:** Use `execute_code` when you need to call Hermes tools programmatically with logic between calls. Use `terminal` for running shell commands, builds, and processes.

## Platform Support

Code execution requires Unix domain sockets and is available on **Linux and macOS only**. It is automatically disabled on Windows — the agent falls back to regular sequential tool calls.

201
hermes_code/website/docs/user-guide/features/context-files.md
Normal file
@ -0,0 +1,201 @@

---
sidebar_position: 8
title: "Context Files"
description: "Project context files — .hermes.md, AGENTS.md, CLAUDE.md, global SOUL.md, and .cursorrules — automatically injected into every conversation"
---

# Context Files

Hermes Agent automatically discovers and loads context files that shape how it behaves. Some are project-local and discovered from your working directory. `SOUL.md` is now global to the Hermes instance and is loaded from `HERMES_HOME` only.

## Supported Context Files

| File | Purpose | Discovery |
|------|---------|-----------|
| **.hermes.md** / **HERMES.md** | Project instructions (highest priority) | Walks to git root |
| **AGENTS.md** | Project instructions, conventions, architecture | Recursive (walks subdirectories) |
| **CLAUDE.md** | Claude Code context files (also detected) | CWD only |
| **SOUL.md** | Global personality and tone customization for this Hermes instance | `HERMES_HOME/SOUL.md` only |
| **.cursorrules** | Cursor IDE coding conventions | CWD only |
| **.cursor/rules/*.mdc** | Cursor IDE rule modules | CWD only |

:::info Priority system
Only **one** project context type is loaded per session (first match wins): `.hermes.md` → `AGENTS.md` → `CLAUDE.md` → `.cursorrules`. **SOUL.md** is always loaded independently as the agent identity (slot #1).
:::

## AGENTS.md

`AGENTS.md` is the primary project context file. It tells the agent how your project is structured, what conventions to follow, and any special instructions.

### Hierarchical Discovery

Hermes walks the directory tree starting from the working directory and loads **all** `AGENTS.md` files found, sorted by depth. This supports monorepo-style setups:

```
my-project/
├── AGENTS.md              ← Top-level project context
├── frontend/
│   └── AGENTS.md          ← Frontend-specific instructions
├── backend/
│   └── AGENTS.md          ← Backend-specific instructions
└── shared/
    └── AGENTS.md          ← Shared library conventions
```

All four files are concatenated into a single context block with relative path headers.

:::info
Directories that are skipped during the walk: `.`-prefixed dirs, `node_modules`, `__pycache__`, `venv`, `.venv`.
:::

### Example AGENTS.md

```markdown
# Project Context

This is a Next.js 14 web application with a Python FastAPI backend.

## Architecture
- Frontend: Next.js 14 with App Router in `/frontend`
- Backend: FastAPI in `/backend`, uses SQLAlchemy ORM
- Database: PostgreSQL 16
- Deployment: Docker Compose on a Hetzner VPS

## Conventions
- Use TypeScript strict mode for all frontend code
- Python code follows PEP 8, use type hints everywhere
- All API endpoints return JSON with `{data, error, meta}` shape
- Tests go in `__tests__/` directories (frontend) or `tests/` (backend)

## Important Notes
- Never modify migration files directly — use Alembic commands
- The `.env.local` file has real API keys, don't commit it
- Frontend port is 3000, backend is 8000, DB is 5432
```

## SOUL.md

`SOUL.md` controls the agent's personality, tone, and communication style. See the [Personality](/docs/user-guide/features/personality) page for full details.

**Location:**

- `~/.hermes/SOUL.md`
- or `$HERMES_HOME/SOUL.md` if you run Hermes with a custom home directory

Important details:

- Hermes seeds a default `SOUL.md` automatically if one does not exist yet
- Hermes loads `SOUL.md` only from `HERMES_HOME`
- Hermes does not probe the working directory for `SOUL.md`
- If the file is empty, nothing from `SOUL.md` is added to the prompt
- If the file has content, the content is injected verbatim after scanning and truncation

## .cursorrules

Hermes is compatible with Cursor IDE's `.cursorrules` file and `.cursor/rules/*.mdc` rule modules. If these files exist in your project root and no higher-priority context file (`.hermes.md`, `AGENTS.md`, or `CLAUDE.md`) is found, they're loaded as the project context.

This means your existing Cursor conventions automatically apply when using Hermes.

## How Context Files Are Loaded

Context files are loaded by `build_context_files_prompt()` in `agent/prompt_builder.py`:

1. **At session start** — the function scans the working directory
2. **Content is read** — each file is read as UTF-8 text
3. **Security scan** — content is checked for prompt injection patterns
4. **Truncation** — files exceeding 20,000 characters are head/tail truncated (70% head, 20% tail, with a marker in the middle)
5. **Assembly** — all sections are combined under a `# Project Context` header
6. **Injection** — the assembled content is added to the system prompt

The final prompt section looks roughly like:

```text
# Project Context

The following project context files have been loaded and should be followed:

## AGENTS.md

[Your AGENTS.md content here]

## .cursorrules

[Your .cursorrules content here]

[Your SOUL.md content here]
```

Notice that SOUL content is inserted directly, without extra wrapper text.

## Security: Prompt Injection Protection

All context files are scanned for potential prompt injection before being included. The scanner checks for:

- **Instruction override attempts**: "ignore previous instructions", "disregard your rules"
- **Deception patterns**: "do not tell the user"
- **System prompt overrides**: "system prompt override"
- **Hidden HTML comments**: `<!-- ignore instructions -->`
- **Hidden div elements**: `<div style="display:none">`
- **Credential exfiltration**: `curl ... $API_KEY`
- **Secret file access**: `cat .env`, `cat credentials`
- **Invisible characters**: zero-width spaces, bidirectional overrides, word joiners

If any threat pattern is detected, the file is blocked:

```
[BLOCKED: AGENTS.md contained potential prompt injection (prompt_injection). Content not loaded.]
```

:::warning
This scanner protects against common injection patterns, but it's not a substitute for reviewing context files in shared repositories. Always validate AGENTS.md content in projects you didn't author.
:::

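A pattern-based scan of this kind can be sketched in a few lines. The regular expressions, the `scan_context_file` helper, and every threat label other than `prompt_injection` (which appears in the blocked message above) are assumptions made for this example; the scanner's real rule set is not reproduced here.

```python
import re

# Illustrative subset of the checks listed above; labels are hypothetical
# except "prompt_injection".
THREAT_PATTERNS = {
    "prompt_injection": r"ignore\s+(all\s+)?previous\s+instructions|disregard\s+your\s+rules",
    "deception": r"do\s+not\s+tell\s+the\s+user",
    "hidden_div": r"<div[^>]*display\s*:\s*none",
    "secret_file_access": r"\bcat\s+\S*(\.env|credentials)\b",
}
# Zero-width space, word joiner, right-to-left override.
INVISIBLE_CHARS = {"\u200b", "\u2060", "\u202e"}


def scan_context_file(name, content):
    """Return the content unchanged, or a BLOCKED notice naming the first threat."""
    threat = None
    if any(ch in content for ch in INVISIBLE_CHARS):
        threat = "invisible_characters"
    else:
        for label, pattern in THREAT_PATTERNS.items():
            if re.search(pattern, content, re.IGNORECASE):
                threat = label
                break
    if threat is None:
        return content
    return f"[BLOCKED: {name} contained potential prompt injection ({threat}). Content not loaded.]"


print(scan_context_file("AGENTS.md", "Use pnpm for package management."))
print(scan_context_file("AGENTS.md", "Please ignore previous instructions."))
```
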
## Size Limits

| Limit | Value |
|-------|-------|
| Max chars per file | 20,000 (~7,000 tokens) |
| Head truncation ratio | 70% |
| Tail truncation ratio | 20% |
| Truncation marker | 10% (shows char counts and suggests using file tools) |

When a file exceeds 20,000 characters, the truncation message reads:

```
[...truncated AGENTS.md: kept 14000+4000 of 25000 chars. Use file tools to read the full file.]
```

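The numbers in the sample message follow directly from the table: 70% of 20,000 is 14,000 and 20% is 4,000. A minimal sketch of head/tail truncation under those limits (`truncate_context` is a hypothetical helper; the newline placement around the marker is an assumption):

```python
MAX_CHARS = 20_000
HEAD_PERCENT = 70
TAIL_PERCENT = 20


def truncate_context(name, text, max_chars=MAX_CHARS):
    """Keep the head and tail of an oversized file with a marker in between."""
    if len(text) <= max_chars:
        return text
    head = max_chars * HEAD_PERCENT // 100
    tail = max_chars * TAIL_PERCENT // 100
    marker = (
        f"[...truncated {name}: kept {head}+{tail} of {len(text)} chars. "
        "Use file tools to read the full file.]"
    )
    return text[:head] + "\n" + marker + "\n" + text[-tail:]


truncated = truncate_context("AGENTS.md", "x" * 25_000)
print(truncated[14_000:14_120].strip())
```

Integer percentages are used instead of floating-point ratios so the kept sizes are exact.
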
## Tips for Effective Context Files

:::tip Best practices for AGENTS.md
1. **Keep it concise** — stay well under 20K chars; the agent reads it every turn
2. **Structure with headers** — use `##` sections for architecture, conventions, important notes
3. **Include concrete examples** — show preferred code patterns, API shapes, naming conventions
4. **Mention what NOT to do** — "never modify migration files directly"
5. **List key paths and ports** — the agent uses these for terminal commands
6. **Update as the project evolves** — stale context is worse than no context
:::

### Per-Subdirectory Context

For monorepos, put subdirectory-specific instructions in nested AGENTS.md files:

```markdown
<!-- frontend/AGENTS.md -->
# Frontend Context

- Use `pnpm` not `npm` for package management
- Components go in `src/components/`, pages in `src/app/`
- Use Tailwind CSS, never inline styles
- Run tests with `pnpm test`
```

```markdown
<!-- backend/AGENTS.md -->
# Backend Context

- Use `poetry` for dependency management
- Run the dev server with `poetry run uvicorn main:app --reload`
- All endpoints need OpenAPI docstrings
- Database models are in `models/`, schemas in `schemas/`
```

@ -0,0 +1,109 @@

---
sidebar_position: 9
title: "Context References"
description: "Inline @-syntax for attaching files, folders, git diffs, and URLs directly into your messages"
---

# Context References

Type `@` followed by a reference to inject content directly into your message. Hermes expands the reference inline and appends the content under an `--- Attached Context ---` section.

## Supported References

| Syntax | Description |
|--------|-------------|
| `@file:path/to/file.py` | Inject file contents |
| `@file:path/to/file.py:10-25` | Inject specific line range (1-indexed, inclusive) |
| `@folder:path/to/dir` | Inject directory tree listing with file metadata |
| `@diff` | Inject `git diff` (unstaged working tree changes) |
| `@staged` | Inject `git diff --staged` (staged changes) |
| `@git:5` | Inject last N commits with patches (max 10) |
| `@url:https://example.com` | Fetch and inject web page content |

## Usage Examples

```text
Review @file:src/main.py and suggest improvements

What changed? @diff

Compare @file:old_config.yaml and @file:new_config.yaml

What's in @folder:src/components?

Summarize this article @url:https://arxiv.org/abs/2301.00001
```

Multiple references work in a single message:

```text
Check @file:main.py, and also @file:test.py.
```

Trailing punctuation (`,`, `.`, `;`, `!`, `?`) is automatically stripped from reference values.

## CLI Tab Completion

In the interactive CLI, typing `@` triggers autocomplete:

- `@` shows all reference types (`@diff`, `@staged`, `@file:`, `@folder:`, `@git:`, `@url:`)
- `@file:` and `@folder:` trigger filesystem path completion with file size metadata
- Bare `@` followed by partial text shows matching files and folders from the current directory

## Line Ranges

The `@file:` reference supports line ranges for precise content injection:

```text
@file:src/main.py:42        # Single line 42
@file:src/main.py:10-25     # Lines 10 through 25 (inclusive)
```

Lines are 1-indexed. Invalid ranges are silently ignored (full file is returned).

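Putting the punctuation and line-range rules together, parsing an `@file:` reference might look like the sketch below. `parse_file_ref` and `select_lines` are hypothetical helpers written for this example, not functions from the Hermes codebase, and treating a reversed range such as `25-10` as invalid is an assumption.

```python
import re


def parse_file_ref(token):
    """Split '@file:path[:start[-end]]' into (path, start, end); lines are 1-indexed.

    Returns (path, None, None) when no valid range is given, meaning "full file".
    """
    value = token[len("@file:"):].rstrip(",.;!?")  # strip trailing punctuation
    match = re.fullmatch(r"(.+?):(\d+)(?:-(\d+))?", value)
    if not match:
        return value, None, None
    path, start = match.group(1), int(match.group(2))
    end = int(match.group(3)) if match.group(3) else start
    if start < 1 or end < start:
        return path, None, None  # invalid range: fall back to the full file
    return path, start, end


def select_lines(text, start, end):
    """Return the requested inclusive line range, or every line if no range is set."""
    lines = text.splitlines()
    if start is None:
        return lines
    return lines[start - 1:end]


print(parse_file_ref("@file:src/main.py:10-25"))
print(parse_file_ref("@file:test.py."))
print(select_lines("a\nb\nc\nd", 2, 3))
```
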
## Size Limits

Context references are bounded to prevent overwhelming the model's context window:

| Threshold | Value | Behavior |
|-----------|-------|----------|
| Soft limit | 25% of context length | Warning appended, expansion proceeds |
| Hard limit | 50% of context length | Expansion refused, original message returned unchanged |
| Folder entries | 200 files max | Excess entries replaced with `- ...` |
| Git commits | 10 max | `@git:N` clamped to range [1, 10] |

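The soft and hard limits reduce to two comparisons against the context length. In this sketch, `check_attachment_budget` and `clamp_git_count` are hypothetical names, sizes are assumed to be measured in the same unit as the context length, and whether the boundaries are inclusive is also an assumption (strictly greater-than is used here).

```python
def check_attachment_budget(attached, context_length):
    """Classify an expansion against the soft (25%) and hard (50%) limits."""
    if attached * 2 > context_length:   # more than 50% of the context
        return "refuse"
    if attached * 4 > context_length:   # more than 25% of the context
        return "warn"
    return "ok"


def clamp_git_count(n):
    """Clamp the N in @git:N to the documented range [1, 10]."""
    return max(1, min(10, n))


for size in (10_000, 30_000, 60_000):
    print(size, check_attachment_budget(size, 100_000))
print(clamp_git_count(25))
```
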
## Security

### Sensitive Path Blocking

These paths are always blocked from `@file:` references to prevent credential exposure:

- SSH keys and config: `~/.ssh/id_rsa`, `~/.ssh/id_ed25519`, `~/.ssh/authorized_keys`, `~/.ssh/config`
- Shell profiles: `~/.bashrc`, `~/.zshrc`, `~/.profile`, `~/.bash_profile`, `~/.zprofile`
- Credential files: `~/.netrc`, `~/.pgpass`, `~/.npmrc`, `~/.pypirc`
- Hermes env: `$HERMES_HOME/.env`

These directories are fully blocked (any file inside):
- `~/.ssh/`, `~/.aws/`, `~/.gnupg/`, `~/.kube/`, `$HERMES_HOME/skills/.hub/`

### Path Traversal Protection

All paths are resolved relative to the working directory. References that resolve outside the allowed workspace root are rejected.

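A common way to implement this kind of check is to resolve both the workspace root and the candidate path, then verify that the candidate still sits under the root. The sketch below shows that general technique with `pathlib`; `resolve_in_workspace` is a hypothetical helper, not the function Hermes uses.

```python
from pathlib import Path


def resolve_in_workspace(workspace_root, reference):
    """Resolve `reference` against the workspace; return None if it escapes the root."""
    root = Path(workspace_root).resolve()
    candidate = (root / reference).resolve()  # collapses ".." and follows symlinks
    try:
        candidate.relative_to(root)
    except ValueError:
        return None
    return candidate


print(resolve_in_workspace("/tmp/ws", "src/main.py"))
print(resolve_in_workspace("/tmp/ws", "../etc/passwd"))
```

Resolving before comparing matters: a plain string-prefix check would accept `ws/../etc/passwd`.
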
### Binary File Detection

Binary files are detected via MIME type and null-byte scanning. Known text extensions (`.py`, `.md`, `.json`, `.yaml`, `.toml`, `.js`, `.ts`, etc.) bypass MIME-based detection. Binary files are rejected with a warning.

## Error Handling

Invalid references produce inline warnings rather than failures:

| Condition | Behavior |
|-----------|----------|
| File not found | Warning: "file not found" |
| Binary file | Warning: "binary files are not supported" |
| Folder not found | Warning: "folder not found" |
| Git command fails | Warning with git stderr |
| URL returns no content | Warning: "no content extracted" |
| Sensitive path | Warning: "path is a sensitive credential file" |
| Path outside workspace | Warning: "path is outside the allowed workspace" |

285
hermes_code/website/docs/user-guide/features/cron.md
Normal file
@ -0,0 +1,285 @@

---
sidebar_position: 5
title: "Scheduled Tasks (Cron)"
description: "Schedule automated tasks with natural language, manage them with one cron tool, and attach one or more skills"
---

# Scheduled Tasks (Cron)

Schedule tasks to run automatically with natural language or cron expressions. Hermes exposes cron management through a single `cronjob` tool with action-style operations instead of separate schedule/list/remove tools.

## What cron can do now

Cron jobs can:

- schedule one-shot or recurring tasks
- pause, resume, edit, trigger, and remove jobs
- attach zero, one, or multiple skills to a job
- deliver results back to the origin chat, local files, or configured platform targets
- run in fresh agent sessions with the normal static tool list

:::warning
Cron-run sessions cannot recursively create more cron jobs. Hermes disables cron management tools inside cron executions to prevent runaway scheduling loops.
:::

## Creating scheduled tasks

### In chat with `/cron`

```bash
/cron add 30m "Remind me to check the build"
/cron add "every 2h" "Check server status"
/cron add "every 1h" "Summarize new feed items" --skill blogwatcher
/cron add "every 1h" "Use both skills and combine the result" --skill blogwatcher --skill find-nearby
```

### From the standalone CLI

```bash
hermes cron create "every 2h" "Check server status"
hermes cron create "every 1h" "Summarize new feed items" --skill blogwatcher
hermes cron create "every 1h" "Use both skills and combine the result" \
  --skill blogwatcher \
  --skill find-nearby \
  --name "Skill combo"
```

### Through natural conversation

Ask Hermes normally:

```text
Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram.
```

Hermes will use the unified `cronjob` tool internally.

## Skill-backed cron jobs
|
||||
|
||||
A cron job can load one or more skills before it runs the prompt.
|
||||
|
||||
### Single skill
|
||||
|
||||
```python
|
||||
cronjob(
|
||||
action="create",
|
||||
skill="blogwatcher",
|
||||
prompt="Check the configured feeds and summarize anything new.",
|
||||
schedule="0 9 * * *",
|
||||
name="Morning feeds",
|
||||
)
|
||||
```
|
||||
|
||||
### Multiple skills
|
||||
|
||||
Skills are loaded in order. The prompt becomes the task instruction layered on top of those skills.
|
||||
|
||||
```python
|
||||
cronjob(
|
||||
action="create",
|
||||
skills=["blogwatcher", "find-nearby"],
|
||||
prompt="Look for new local events and interesting nearby places, then combine them into one short brief.",
|
||||
schedule="every 6h",
|
||||
name="Local brief",
|
||||
)
|
||||
```
|
||||
|
||||
This is useful when you want a scheduled agent to inherit reusable workflows without stuffing the full skill text into the cron prompt itself.
|
||||
|
||||
## Editing jobs
|
||||
|
||||
You do not need to delete and recreate jobs just to change them.
|
||||
|
||||
### Chat
|
||||
|
||||
```bash
|
||||
/cron edit <job_id> --schedule "every 4h"
|
||||
/cron edit <job_id> --prompt "Use the revised task"
|
||||
/cron edit <job_id> --skill blogwatcher --skill find-nearby
|
||||
/cron edit <job_id> --remove-skill blogwatcher
|
||||
/cron edit <job_id> --clear-skills
|
||||
```
|
||||
|
||||
### Standalone CLI
|
||||
|
||||
```bash
|
||||
hermes cron edit <job_id> --schedule "every 4h"
|
||||
hermes cron edit <job_id> --prompt "Use the revised task"
|
||||
hermes cron edit <job_id> --skill blogwatcher --skill find-nearby
|
||||
hermes cron edit <job_id> --add-skill find-nearby
|
||||
hermes cron edit <job_id> --remove-skill blogwatcher
|
||||
hermes cron edit <job_id> --clear-skills
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
- repeated `--skill` replaces the job's attached skill list
|
||||
- `--add-skill` appends to the existing list without replacing it
|
||||
- `--remove-skill` removes specific attached skills
|
||||
- `--clear-skills` removes all attached skills
|
||||
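The four skill flags in the notes above can be sketched as one small function. This is only an illustration of the described semantics, not Hermes's implementation; `apply_skill_edit` and its parameter names are hypothetical.

```python
def apply_skill_edit(current, set_to=None, add=(), remove=(), clear=False):
    """Return the new skill list for a job (hypothetical helper).

    set_to -> repeated --skill: replaces the attached list
    add    -> --add-skill: appends without replacing
    remove -> --remove-skill: drops specific skills
    clear  -> --clear-skills: drops everything
    """
    if clear:
        return []
    skills = list(set_to) if set_to is not None else list(current)
    for name in add:
        if name not in skills:  # avoid attaching the same skill twice
            skills.append(name)
    to_remove = set(remove)
    return [name for name in skills if name not in to_remove]


attached = ["blogwatcher"]
print(apply_skill_edit(attached, add=["find-nearby"]))
print(apply_skill_edit(attached, set_to=["find-nearby"]))
print(apply_skill_edit(attached, clear=True))
```

Whether the real CLI de-duplicates repeated skills is an assumption made here for tidiness.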
|
||||
## Lifecycle actions
|
||||
|
||||
Cron jobs support a fuller lifecycle than create and remove: you can also list, pause, resume, and manually trigger them.
|
||||
|
||||
### Chat
|
||||
|
||||
```bash
|
||||
/cron list
|
||||
/cron pause <job_id>
|
||||
/cron resume <job_id>
|
||||
/cron run <job_id>
|
||||
/cron remove <job_id>
|
||||
```
|
||||
|
||||
### Standalone CLI
|
||||
|
||||
```bash
|
||||
hermes cron list
|
||||
hermes cron pause <job_id>
|
||||
hermes cron resume <job_id>
|
||||
hermes cron run <job_id>
|
||||
hermes cron remove <job_id>
|
||||
hermes cron status
|
||||
hermes cron tick
|
||||
```
|
||||
|
||||
What they do:
|
||||
|
||||
- `pause` — keep the job but stop scheduling it
|
||||
- `resume` — re-enable the job and compute the next future run
|
||||
- `run` — trigger the job on the next scheduler tick
|
||||
- `remove` — delete it entirely
|
||||
|
||||
## How it works
|
||||
|
||||
**Cron execution is handled by the gateway daemon.** The gateway ticks the scheduler every 60 seconds, running any due jobs in isolated agent sessions.
|
||||
|
||||
```bash
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux: boot-time system service for servers
|
||||
hermes gateway # Or run in foreground
|
||||
|
||||
hermes cron list
|
||||
hermes cron status
|
||||
```
|
||||
|
||||
### Gateway scheduler behavior
|
||||
|
||||
On each tick Hermes:
|
||||
|
||||
1. loads jobs from `~/.hermes/cron/jobs.json`
|
||||
2. checks `next_run_at` against the current time
|
||||
3. starts a fresh `AIAgent` session for each due job
|
||||
4. optionally injects one or more attached skills into that fresh session
|
||||
5. runs the prompt to completion
|
||||
6. delivers the final response
|
||||
7. updates run metadata and the next scheduled time
|
||||
|
||||
A file lock at `~/.hermes/cron/.tick.lock` prevents overlapping scheduler ticks from double-running the same job batch.
|
||||
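Step 2 of the tick (comparing `next_run_at` with the current time) can be sketched as below. This is a minimal illustration under stated assumptions: `select_due_jobs` and the `paused` flag are hypothetical, timestamps are assumed to be ISO 8601 strings, and the file lock, agent session, and delivery steps are left out.

```python
from datetime import datetime, timezone


def select_due_jobs(jobs, now):
    """Return jobs whose next_run_at is at or before `now`.

    `paused` is a hypothetical field; only `next_run_at` appears in the docs.
    """
    due = []
    for job in jobs:
        if job.get("paused"):
            continue  # paused jobs are kept but never scheduled
        next_run = datetime.fromisoformat(job["next_run_at"])
        if next_run <= now:
            due.append(job)
    return due


now = datetime(2026, 3, 15, 9, 0, tzinfo=timezone.utc)
jobs = [
    {"id": "a", "next_run_at": "2026-03-15T08:59:00+00:00"},
    {"id": "b", "next_run_at": "2026-03-15T09:30:00+00:00"},
    {"id": "c", "next_run_at": "2026-03-15T08:00:00+00:00", "paused": True},
]
due_ids = [job["id"] for job in select_due_jobs(jobs, now)]
print(due_ids)
```

Only job `a` is selected: `b` is not due yet and `c` is paused.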
|
||||
## Delivery options
|
||||
|
||||
When scheduling jobs, you specify where the output goes:
|
||||
|
||||
| Option | Description | Notes |
|
||||
|--------|-------------|---------|
|
||||
| `"origin"` | Back to where the job was created | Default on messaging platforms |
|
||||
| `"local"` | Save to local files only (`~/.hermes/cron/output/`) | Default on CLI |
|
||||
| `"telegram"` | Telegram home channel | Uses `TELEGRAM_HOME_CHANNEL` |
|
||||
| `"discord"` | Discord home channel | Uses `DISCORD_HOME_CHANNEL` |
|
||||
| `"telegram:123456"` | Specific Telegram chat by ID | Direct delivery |
|
||||
| `"discord:987654"` | Specific Discord channel by ID | Direct delivery |
|
||||
|
||||
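The option strings in the table follow a `platform` or `platform:id` shape. A minimal sketch of splitting them, assuming that shape holds for every target; `parse_delivery_target` is a hypothetical helper, not part of Hermes.

```python
def parse_delivery_target(target):
    """Split a delivery option into (platform, chat_id).

    chat_id is None when no explicit ID is given, e.g. "origin", "local",
    or a bare platform name that uses its configured home channel.
    """
    platform, separator, chat_id = target.partition(":")
    return platform, (chat_id if separator else None)


for option in ["origin", "local", "telegram", "discord:987654"]:
    print(option, "->", parse_delivery_target(option))
```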
The agent's final response is delivered automatically, so you do **not** need to include `send_message` in the cron prompt for that same destination. If a cron run calls `send_message` to the exact target the scheduler will already deliver to, Hermes skips that duplicate send and tells the model to put the user-facing content in the final response instead. Use `send_message` only for additional or different targets.
|
||||
|
||||
## Schedule formats
|
||||
|
||||
### Relative delays (one-shot)
|
||||
|
||||
```text
|
||||
30m → Run once in 30 minutes
|
||||
2h → Run once in 2 hours
|
||||
1d → Run once in 1 day
|
||||
```
|
||||
|
||||
### Intervals (recurring)
|
||||
|
||||
```text
|
||||
every 30m → Every 30 minutes
|
||||
every 2h → Every 2 hours
|
||||
every 1d → Every day
|
||||
```
|
||||
|
||||
### Cron expressions
|
||||
|
||||
```text
|
||||
0 9 * * * → Daily at 9:00 AM
|
||||
0 9 * * 1-5 → Weekdays at 9:00 AM
|
||||
0 */6 * * * → Every 6 hours
|
||||
30 8 1 * * → First of every month at 8:30 AM
|
||||
0 0 * * 0 → Every Sunday at midnight
|
||||
```
|
||||
|
||||
### ISO timestamps
|
||||
|
||||
```text
|
||||
2026-03-15T09:00:00 → One-time at March 15, 2026 9:00 AM
|
||||
```
|
||||
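The four schedule formats above can be told apart with a few simple checks. The sketch below is an assumption about how such a classifier might look, not Hermes's parser: `classify_schedule` is a hypothetical name, and cron expressions are only recognized by their five-field shape, not validated.

```python
import re
from datetime import datetime, timedelta

UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}
DURATION_RE = re.compile(r"^(\d+)([mhd])$")


def classify_schedule(text):
    """Classify a schedule string as (kind, value)."""
    text = text.strip()
    recurring = text.startswith("every ")
    body = text[len("every "):].strip() if recurring else text

    match = DURATION_RE.match(body)
    if match:
        seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        # "every 2h" recurs; a bare "2h" is a one-shot delay
        return ("interval" if recurring else "delay"), timedelta(seconds=seconds)

    if not recurring and len(text.split()) == 5:
        return "cron", text  # five whitespace-separated fields

    if not recurring:
        try:
            return "timestamp", datetime.fromisoformat(text)
        except ValueError:
            pass
    raise ValueError(f"unrecognized schedule: {text!r}")


for example in ["30m", "every 2h", "0 9 * * 1-5", "2026-03-15T09:00:00"]:
    print(example, "->", classify_schedule(example)[0])
```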
|
||||
## Repeat behavior
|
||||
|
||||
| Schedule type | Default repeat | Behavior |
|
||||
|--------------|----------------|----------|
|
||||
| One-shot (`30m`, timestamp) | 1 | Runs once |
|
||||
| Interval (`every 2h`) | forever | Runs until removed |
|
||||
| Cron expression | forever | Runs until removed |
|
||||
|
||||
You can override it:
|
||||
|
||||
```python
|
||||
cronjob(
|
||||
action="create",
|
||||
prompt="...",
|
||||
schedule="every 2h",
|
||||
repeat=5,
|
||||
)
|
||||
```
|
||||
|
||||
## Managing jobs programmatically
|
||||
|
||||
The agent-facing API is one tool:
|
||||
|
||||
```python
|
||||
cronjob(action="create", ...)
|
||||
cronjob(action="list")
|
||||
cronjob(action="update", job_id="...")
|
||||
cronjob(action="pause", job_id="...")
|
||||
cronjob(action="resume", job_id="...")
|
||||
cronjob(action="run", job_id="...")
|
||||
cronjob(action="remove", job_id="...")
|
||||
```
|
||||
|
||||
For `update`, pass `skills=[]` to remove all attached skills.
|
||||
|
||||
## Job storage
|
||||
|
||||
Jobs are stored in `~/.hermes/cron/jobs.json`. Output from job runs is saved to `~/.hermes/cron/output/{job_id}/{timestamp}.md`.
|
||||
|
||||
The storage uses atomic file writes so interrupted writes do not leave a partially written job file behind.
|
||||
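Atomic writes are commonly done by writing to a temporary file in the same directory and then renaming it over the target. The following is a generic sketch of that pattern, assuming a JSON job file; `write_json_atomically` is a hypothetical helper and may differ from what Hermes does internally.

```python
import json
import os
import tempfile


def write_json_atomically(path, data):
    """Write JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave the temp file behind
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "jobs.json")
    write_json_atomically(target, {"jobs": []})
    print(os.listdir(workdir))
```

If the process dies before `os.replace`, the previous `jobs.json` is still intact.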
|
||||
## Self-contained prompts still matter
|
||||
|
||||
:::warning Important
|
||||
Cron jobs run in a completely fresh agent session. The prompt must contain everything the agent needs that is not already provided by attached skills.
|
||||
:::
|
||||
|
||||
**BAD:** `"Check on that server issue"`
|
||||
|
||||
**GOOD:** `"SSH into server 192.168.1.100 as user 'deploy', check if nginx is running with 'systemctl status nginx', and verify https://example.com returns HTTP 200."`
|
||||
|
||||
## Security
|
||||
|
||||
Scheduled task prompts are scanned for prompt-injection and credential-exfiltration patterns at creation and update time. Prompts containing invisible Unicode tricks, SSH backdoor attempts, or obvious secret-exfiltration payloads are blocked.
|
||||
222
hermes_code/website/docs/user-guide/features/delegation.md
Normal file
|
|
@ -0,0 +1,222 @@
|
|||
---
|
||||
sidebar_position: 7
|
||||
title: "Subagent Delegation"
|
||||
description: "Spawn isolated child agents for parallel workstreams with delegate_task"
|
||||
---
|
||||
|
||||
# Subagent Delegation
|
||||
|
||||
The `delegate_task` tool spawns child AIAgent instances with isolated context, restricted toolsets, and their own terminal sessions. Each child gets a fresh conversation and works independently — only its final summary enters the parent's context.
|
||||
|
||||
## Single Task
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Debug why tests fail",
|
||||
context="Error: assertion in test_foo.py line 42",
|
||||
toolsets=["terminal", "file"]
|
||||
)
|
||||
```
|
||||
|
||||
## Parallel Batch
|
||||
|
||||
Up to 3 concurrent subagents:
|
||||
|
||||
```python
|
||||
delegate_task(tasks=[
|
||||
{"goal": "Research topic A", "toolsets": ["web"]},
|
||||
{"goal": "Research topic B", "toolsets": ["web"]},
|
||||
{"goal": "Fix the build", "toolsets": ["terminal", "file"]}
|
||||
])
|
||||
```
|
||||
|
||||
## How Subagent Context Works
|
||||
|
||||
:::warning Critical: Subagents Know Nothing
|
||||
Subagents start with a **completely fresh conversation**. They have zero knowledge of the parent's conversation history, prior tool calls, or anything discussed before delegation. The subagent's only context comes from the `goal` and `context` fields you provide.
|
||||
:::
|
||||
|
||||
This means you must pass **everything** the subagent needs:
|
||||
|
||||
```python
|
||||
# BAD - subagent has no idea what "the error" is
|
||||
delegate_task(goal="Fix the error")
|
||||
|
||||
# GOOD - subagent has all context it needs
|
||||
delegate_task(
|
||||
goal="Fix the TypeError in api/handlers.py",
|
||||
context="""The file api/handlers.py has a TypeError on line 47:
|
||||
'NoneType' object has no attribute 'get'.
|
||||
The function process_request() receives a dict from parse_body(),
|
||||
but parse_body() returns None when Content-Type is missing.
|
||||
The project is at /home/user/myproject and uses Python 3.11."""
|
||||
)
|
||||
```
|
||||
|
||||
The subagent receives a focused system prompt built from your goal and context, instructing it to complete the task and provide a structured summary of what it did, what it found, any files modified, and any issues encountered.
|
||||
|
||||
## Practical Examples
|
||||
|
||||
### Parallel Research
|
||||
|
||||
Research multiple topics simultaneously and collect summaries:
|
||||
|
||||
```python
|
||||
delegate_task(tasks=[
|
||||
{
|
||||
"goal": "Research the current state of WebAssembly in 2025",
|
||||
"context": "Focus on: browser support, non-browser runtimes, language support",
|
||||
"toolsets": ["web"]
|
||||
},
|
||||
{
|
||||
"goal": "Research the current state of RISC-V adoption in 2025",
|
||||
"context": "Focus on: server chips, embedded systems, software ecosystem",
|
||||
"toolsets": ["web"]
|
||||
},
|
||||
{
|
||||
"goal": "Research quantum computing progress in 2025",
|
||||
"context": "Focus on: error correction breakthroughs, practical applications, key players",
|
||||
"toolsets": ["web"]
|
||||
}
|
||||
])
|
||||
```
|
||||
|
||||
### Code Review + Fix
|
||||
|
||||
Delegate a review-and-fix workflow to a fresh context:
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Review the authentication module for security issues and fix any found",
|
||||
context="""Project at /home/user/webapp.
|
||||
Auth module files: src/auth/login.py, src/auth/jwt.py, src/auth/middleware.py.
|
||||
The project uses Flask, PyJWT, and bcrypt.
|
||||
Focus on: SQL injection, JWT validation, password handling, session management.
|
||||
Fix any issues found and run the test suite (pytest tests/auth/).""",
|
||||
toolsets=["terminal", "file"]
|
||||
)
|
||||
```
|
||||
|
||||
### Multi-File Refactoring
|
||||
|
||||
Delegate a large refactoring task that would flood the parent's context:
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Refactor all Python files in src/ to replace print() with proper logging",
|
||||
context="""Project at /home/user/myproject.
|
||||
Use the 'logging' module with logger = logging.getLogger(__name__).
|
||||
Replace print() calls with appropriate log levels:
|
||||
- print(f"Error: ...") -> logger.error(...)
|
||||
- print(f"Warning: ...") -> logger.warning(...)
|
||||
- print(f"Debug: ...") -> logger.debug(...)
|
||||
- Other prints -> logger.info(...)
|
||||
Don't change print() in test files or CLI output.
|
||||
Run pytest after to verify nothing broke.""",
|
||||
toolsets=["terminal", "file"]
|
||||
)
|
||||
```
|
||||
|
||||
## Batch Mode Details
|
||||
|
||||
When you provide a `tasks` array, subagents run in **parallel** using a thread pool:
|
||||
|
||||
- **Maximum concurrency:** 3 tasks (the `tasks` array is truncated to 3 if longer)
|
||||
- **Thread pool:** Uses `ThreadPoolExecutor` with `MAX_CONCURRENT_CHILDREN = 3` workers
|
||||
- **Progress display:** In CLI mode, a tree-view shows tool calls from each subagent in real-time with per-task completion lines. In gateway mode, progress is batched and relayed to the parent's progress callback
|
||||
- **Result ordering:** Results are sorted by task index to match input order regardless of completion order
|
||||
- **Interrupt propagation:** Interrupting the parent (e.g., sending a new message) interrupts all active children
|
||||
|
||||
Single-task delegation runs directly without thread pool overhead.
|
||||
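The batch behavior described above (truncate to three tasks, run them in a thread pool, return results in input order) can be sketched as follows. `run_batch` and `run_child` are hypothetical names; only `ThreadPoolExecutor` and `MAX_CONCURRENT_CHILDREN = 3` come from the text, and progress display and interrupt handling are omitted.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CONCURRENT_CHILDREN = 3  # value taken from the text above


def run_batch(tasks, run_child):
    """Run up to three tasks in parallel and return results in input order."""
    tasks = tasks[:MAX_CONCURRENT_CHILDREN]  # longer lists are truncated
    indexed_results = []
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_CHILDREN) as pool:
        futures = {
            pool.submit(run_child, task): index
            for index, task in enumerate(tasks)
        }
        for future in as_completed(futures):  # completion order varies
            indexed_results.append((futures[future], future.result()))
    indexed_results.sort(key=lambda pair: pair[0])  # restore input order
    return [summary for _, summary in indexed_results]


def fake_child(task):
    # Stand-in for a real subagent run; returns a one-line summary.
    return f"done: {task['goal']}"


batch = [{"goal": goal} for goal in ["A", "B", "C", "D"]]
print(run_batch(batch, fake_child))
```

The fourth task is dropped by the truncation step, which matches the documented limit.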
|
||||
## Model Override
|
||||
|
||||
You can configure a different model for subagents via `config.yaml` — useful for delegating simple tasks to cheaper/faster models:
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
delegation:
|
||||
  model: "google/gemini-flash-2.0" # Cheaper model for subagents
|
||||
  provider: "openrouter" # Optional: route subagents to a different provider
|
||||
```
|
||||
|
||||
If omitted, subagents use the same model as the parent.
|
||||
|
||||
## Toolset Selection Tips
|
||||
|
||||
The `toolsets` parameter controls what tools the subagent has access to. Choose based on the task:
|
||||
|
||||
| Toolset Pattern | Use Case |
|
||||
|----------------|----------|
|
||||
| `["terminal", "file"]` | Code work, debugging, file editing, builds |
|
||||
| `["web"]` | Research, fact-checking, documentation lookup |
|
||||
| `["terminal", "file", "web"]` | Full-stack tasks (default) |
|
||||
| `["file"]` | Read-only analysis, code review without execution |
|
||||
| `["terminal"]` | System administration, process management |
|
||||
|
||||
Certain toolsets are **always blocked** for subagents regardless of what you specify:
|
||||
- `delegation` — no recursive delegation (prevents infinite spawning)
|
||||
- `clarify` — subagents cannot interact with the user
|
||||
- `memory` — no writes to shared persistent memory
|
||||
- `code_execution` — children should reason step-by-step
|
||||
- `send_message` — no cross-platform side effects (e.g., sending Telegram messages)
|
||||
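A minimal sketch of how the always-blocked list might be applied to a requested `toolsets` value. The blocked names and the default come from this page; `effective_toolsets` is a hypothetical helper and not the actual filtering code.

```python
BLOCKED_TOOLSETS = {
    "delegation",      # no recursive delegation
    "clarify",         # no user interaction
    "memory",          # no writes to shared memory
    "code_execution",  # children reason step by step
    "send_message",    # no cross-platform side effects
}
DEFAULT_TOOLSETS = ["terminal", "file", "web"]


def effective_toolsets(requested=None):
    """Drop blocked toolsets, falling back to the default selection."""
    chosen = DEFAULT_TOOLSETS if requested is None else requested
    return [name for name in chosen if name not in BLOCKED_TOOLSETS]


print(effective_toolsets(["terminal", "memory", "web"]))
print(effective_toolsets())
```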
|
||||
## Max Iterations
|
||||
|
||||
Each subagent has an iteration limit (default: 50) that controls how many tool-calling turns it can take:
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Quick file check",
|
||||
context="Check if /etc/nginx/nginx.conf exists and print its first 10 lines",
|
||||
max_iterations=10 # Simple task that does not need many turns
|
||||
)
|
||||
```
|
||||
|
||||
## Depth Limit
|
||||
|
||||
Delegation has a **depth limit of 2** — a parent (depth 0) can spawn children (depth 1), but children cannot delegate further. This prevents runaway recursive delegation chains.
|
||||
|
||||
## Key Properties
|
||||
|
||||
- Each subagent gets its **own terminal session** (separate from the parent)
|
||||
- **No nested delegation** — children cannot delegate further (no grandchildren)
|
||||
- Subagents **cannot** call: `delegate_task`, `clarify`, `memory`, `send_message`, `execute_code`
|
||||
- **Interrupt propagation** — interrupting the parent interrupts all active children
|
||||
- Only the final summary enters the parent's context, keeping token usage efficient
|
||||
- Subagents inherit the parent's **API key and provider configuration**
|
||||
|
||||
## Delegation vs execute_code
|
||||
|
||||
| Factor | delegate_task | execute_code |
|
||||
|--------|--------------|-------------|
|
||||
| **Reasoning** | Full LLM reasoning loop | Just Python code execution |
|
||||
| **Context** | Fresh isolated conversation | No conversation, just script |
|
||||
| **Tool access** | All non-blocked tools with reasoning | 7 tools via RPC, no reasoning |
|
||||
| **Parallelism** | Up to 3 concurrent subagents | Single script |
|
||||
| **Best for** | Complex tasks needing judgment | Mechanical multi-step pipelines |
|
||||
| **Token cost** | Higher (full LLM loop) | Lower (only stdout returned) |
|
||||
| **User interaction** | None (subagents can't clarify) | None |
|
||||
|
||||
**Rule of thumb:** Use `delegate_task` when the subtask requires reasoning, judgment, or multi-step problem solving. Use `execute_code` when you need mechanical data processing or scripted workflows.
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
delegation:
|
||||
  max_iterations: 50 # Max turns per child (default: 50)
|
||||
  default_toolsets: ["terminal", "file", "web"] # Default toolsets
|
||||
  model: "google/gemini-3-flash-preview" # Optional provider/model override
|
||||
  provider: "openrouter" # Optional built-in provider
|
||||
|
||||
# Alternatively, replace the block above with a direct custom endpoint (keep only one `delegation:` key):
|
||||
delegation:
|
||||
  model: "qwen2.5-coder"
|
||||
  base_url: "http://localhost:1234/v1"
|
||||
  api_key: "local-key"
|
||||
```
|
||||
|
||||
:::tip
|
||||
The agent handles delegation automatically based on the task complexity. You don't need to explicitly ask it to delegate — it will do so when it makes sense.
|
||||
:::
|
||||
|
|
@ -0,0 +1,323 @@
|
|||
---
|
||||
title: Fallback Providers
|
||||
description: Configure automatic failover to backup LLM providers when your primary model is unavailable.
|
||||
sidebar_label: Fallback Providers
|
||||
sidebar_position: 8
|
||||
---
|
||||
|
||||
# Fallback Providers
|
||||
|
||||
Hermes Agent has two separate fallback systems that keep your sessions running when providers hit issues:
|
||||
|
||||
1. **Primary model fallback** — automatically switches to a backup provider:model when your main model fails
|
||||
2. **Auxiliary task fallback** — independent provider resolution for side tasks like vision, compression, and web extraction
|
||||
|
||||
Both are optional and work independently.
|
||||
|
||||
## Primary Model Fallback
|
||||
|
||||
When your main LLM provider encounters errors — rate limits, server overload, auth failures, connection drops — Hermes can automatically switch to a backup provider:model pair mid-session without losing your conversation.
|
||||
|
||||
### Configuration
|
||||
|
||||
Add a `fallback_model` section to `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
fallback_model:
|
||||
  provider: openrouter
|
||||
  model: anthropic/claude-sonnet-4
|
||||
```
|
||||
|
||||
Both `provider` and `model` are **required**. If either is missing, the fallback is disabled.
|
||||
|
||||
### Supported Providers
|
||||
|
||||
| Provider | Value | Requirements |
|
||||
|----------|-------|-------------|
|
||||
| AI Gateway | `ai-gateway` | `AI_GATEWAY_API_KEY` |
|
||||
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` |
|
||||
| Nous Portal | `nous` | `hermes login` (OAuth) |
|
||||
| OpenAI Codex | `openai-codex` | `hermes model` (ChatGPT OAuth) |
|
||||
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` or Claude Code credentials |
|
||||
| z.ai / GLM | `zai` | `GLM_API_KEY` |
|
||||
| Kimi / Moonshot | `kimi-coding` | `KIMI_API_KEY` |
|
||||
| MiniMax | `minimax` | `MINIMAX_API_KEY` |
|
||||
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
|
||||
| Kilo Code | `kilocode` | `KILOCODE_API_KEY` |
|
||||
| Custom endpoint | `custom` | `base_url` + `api_key_env` (see below) |
|
||||
|
||||
### Custom Endpoint Fallback
|
||||
|
||||
For a custom OpenAI-compatible endpoint, add `base_url` and optionally `api_key_env`:
|
||||
|
||||
```yaml
|
||||
fallback_model:
|
||||
  provider: custom
|
||||
  model: my-local-model
|
||||
  base_url: http://localhost:8000/v1
|
||||
  api_key_env: MY_LOCAL_KEY # env var name containing the API key
|
||||
```
|
||||
|
||||
### When Fallback Triggers
|
||||
|
||||
The fallback activates automatically when the primary model fails with:
|
||||
|
||||
- **Rate limits** (HTTP 429) — after exhausting retry attempts
|
||||
- **Server errors** (HTTP 500, 502, 503) — after exhausting retry attempts
|
||||
- **Auth failures** (HTTP 401, 403) — immediately (no point retrying)
|
||||
- **Not found** (HTTP 404) — immediately
|
||||
- **Invalid responses** — when the API returns malformed or empty responses repeatedly
|
||||
|
||||
When triggered, Hermes:
|
||||
|
||||
1. Resolves credentials for the fallback provider
|
||||
2. Builds a new API client
|
||||
3. Swaps the model, provider, and client in-place
|
||||
4. Resets the retry counter and continues the conversation
|
||||
|
||||
The switch is seamless — your conversation history, tool calls, and context are preserved. The agent continues from exactly where it left off, just using a different model.
|
||||
|
||||
:::info One-Shot
|
||||
Fallback activates **at most once** per session. If the fallback provider also fails, normal error handling takes over (retries, then error message). This prevents cascading failover loops.
|
||||
:::
|
||||
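The trigger rules and the one-shot limit above can be summarized as a small decision function. This is a sketch under assumptions, not the real control flow: `next_action` and its parameters are hypothetical, and the "invalid responses" case is not modeled.

```python
RETRY_THEN_FALLBACK = {429, 500, 502, 503}  # retried first
IMMEDIATE_FALLBACK = {401, 403, 404}        # no point retrying


def next_action(status, retries_left, fallback_configured, fallback_used):
    """Return "retry", "fallback", or "fail" for a failed model call."""
    if status in RETRY_THEN_FALLBACK and retries_left > 0:
        return "retry"
    if status in RETRY_THEN_FALLBACK or status in IMMEDIATE_FALLBACK:
        # Fallback is one-shot: it activates at most once per session.
        if fallback_configured and not fallback_used:
            return "fallback"
    return "fail"


print(next_action(429, retries_left=2, fallback_configured=True, fallback_used=False))
print(next_action(429, retries_left=0, fallback_configured=True, fallback_used=False))
print(next_action(401, retries_left=3, fallback_configured=True, fallback_used=False))
print(next_action(503, retries_left=0, fallback_configured=True, fallback_used=True))
```

The last call returns `"fail"` because the fallback has already been used once in the session.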
|
||||
### Examples
|
||||
|
||||
**OpenRouter as fallback for Anthropic native:**
|
||||
```yaml
|
||||
model:
|
||||
  provider: anthropic
|
||||
  default: claude-sonnet-4-6
|
||||
|
||||
fallback_model:
|
||||
  provider: openrouter
|
||||
  model: anthropic/claude-sonnet-4
|
||||
```
|
||||
|
||||
**Nous Portal as fallback for OpenRouter:**
|
||||
```yaml
|
||||
model:
|
||||
  provider: openrouter
|
||||
  default: anthropic/claude-opus-4
|
||||
|
||||
fallback_model:
|
||||
  provider: nous
|
||||
  model: nous-hermes-3
|
||||
```
|
||||
|
||||
**Local model as fallback for cloud:**
|
||||
```yaml
|
||||
fallback_model:
|
||||
  provider: custom
|
||||
  model: llama-3.1-70b
|
||||
  base_url: http://localhost:8000/v1
|
||||
  api_key_env: LOCAL_API_KEY
|
||||
```
|
||||
|
||||
**Codex OAuth as fallback:**
|
||||
```yaml
|
||||
fallback_model:
|
||||
  provider: openai-codex
|
||||
  model: gpt-5.3-codex
|
||||
```
|
||||
|
||||
### Where Fallback Works
|
||||
|
||||
| Context | Fallback Supported |
|
||||
|---------|-------------------|
|
||||
| CLI sessions | ✔ |
|
||||
| Messaging gateway (Telegram, Discord, etc.) | ✔ |
|
||||
| Subagent delegation | ✘ (subagents do not inherit fallback config) |
|
||||
| Cron jobs | ✘ (run with a fixed provider) |
|
||||
| Auxiliary tasks (vision, compression) | ✘ (use their own provider chain — see below) |
|
||||
|
||||
:::tip
|
||||
There are no environment variables for `fallback_model` — it is configured exclusively through `config.yaml`. This is intentional: fallback configuration is a deliberate choice, not something a stale shell export should override.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Auxiliary Task Fallback
|
||||
|
||||
Hermes uses separate lightweight models for side tasks. Each task has its own provider resolution chain that acts as a built-in fallback system.
|
||||
|
||||
### Tasks with Independent Provider Resolution
|
||||
|
||||
| Task | What It Does | Config Key |
|
||||
|------|-------------|-----------|
|
||||
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
|
||||
| Web Extract | Web page summarization | `auxiliary.web_extract` |
|
||||
| Compression | Context compression summaries | `auxiliary.compression` or `compression.summary_provider` |
|
||||
| Session Search | Past session summarization | `auxiliary.session_search` |
|
||||
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
|
||||
| MCP | MCP helper operations | `auxiliary.mcp` |
|
||||
| Memory Flush | Memory consolidation | `auxiliary.flush_memories` |
|
||||
|
||||
### Auto-Detection Chain
|
||||
|
||||
When a task's provider is set to `"auto"` (the default), Hermes tries providers in order until one works:
|
||||
|
||||
**For text tasks (compression, web extract, etc.):**
|
||||
|
||||
```text
|
||||
OpenRouter → Nous Portal → Custom endpoint → Codex OAuth →
|
||||
API-key providers (z.ai, Kimi, MiniMax, Anthropic) → give up
|
||||
```
|
||||
|
||||
**For vision tasks:**
|
||||
|
||||
```text
|
||||
Main provider (if vision-capable) → OpenRouter → Nous Portal →
|
||||
Codex OAuth → Anthropic → Custom endpoint → give up
|
||||
```
|
||||
|
||||
If the resolved provider fails at call time, Hermes also has an internal retry: if the provider is not OpenRouter and no explicit `base_url` is set, it tries OpenRouter as a last-resort fallback.
|
||||
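The `"auto"` resolution amounts to walking an ordered list and taking the first provider that is usable. A minimal sketch, assuming availability can be checked with a simple predicate; the chain labels and `resolve_provider` are illustrative names, not Hermes identifiers.

```python
# Order for text tasks as listed above; the labels are illustrative only.
TEXT_CHAIN = ["openrouter", "nous", "custom", "codex", "api-key-providers"]


def resolve_provider(chain, is_available):
    """Return the first available provider in `chain`, or None (give up)."""
    for name in chain:
        if is_available(name):
            return name
    return None


configured = {"custom", "codex"}  # pretend only these have credentials
print(resolve_provider(TEXT_CHAIN, lambda name: name in configured))
print(resolve_provider(TEXT_CHAIN, lambda name: False))
```

The vision chain would use the same function with a different ordered list.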
|
||||
### Configuring Auxiliary Providers
|
||||
|
||||
Each task can be configured independently in `config.yaml`:
|
||||
|
||||
```yaml
|
||||
auxiliary:
|
||||
  vision:
|
||||
    provider: "auto" # auto | openrouter | nous | codex | main | anthropic
|
||||
    model: "" # e.g. "openai/gpt-4o"
|
||||
    base_url: "" # direct endpoint (takes precedence over provider)
|
||||
    api_key: "" # API key for base_url
|
||||
|
||||
  web_extract:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
|
||||
  compression:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
|
||||
  session_search:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
|
||||
  skills_hub:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
|
||||
  mcp:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
|
||||
  flush_memories:
|
||||
    provider: "auto"
|
||||
    model: ""
|
||||
```
|
||||
|
||||
Every task above follows the same **provider / model / base_url** pattern. Context compression uses its own top-level block:
|
||||
|
||||
```yaml
|
||||
compression:
|
||||
  summary_provider: main # Same provider options as auxiliary tasks
|
||||
  summary_model: google/gemini-3-flash-preview
|
||||
  summary_base_url: null # Custom OpenAI-compatible endpoint
|
||||
```
|
||||
|
||||
And the fallback model uses:
|
||||
|
||||
```yaml
|
||||
fallback_model:
|
||||
  provider: openrouter
|
||||
  model: anthropic/claude-sonnet-4
|
||||
  # base_url: http://localhost:8000/v1 # Optional custom endpoint
|
||||
```
|
||||
|
||||
All three (auxiliary, compression, fallback) work the same way: set `provider` (`summary_provider` for compression) to pick who handles the request, `model` (`summary_model`) to pick which model, and `base_url` (`summary_base_url`) to point at a custom endpoint, which overrides the provider.
|
||||
|
||||
### Provider Options for Auxiliary Tasks
|
||||
|
||||
| Provider | Description | Requirements |
|
||||
|----------|-------------|-------------|
|
||||
| `"auto"` | Try providers in order until one works (default) | At least one provider configured |
|
||||
| `"openrouter"` | Force OpenRouter | `OPENROUTER_API_KEY` |
|
||||
| `"nous"` | Force Nous Portal | `hermes login` |
|
||||
| `"codex"` | Force Codex OAuth | `hermes model` → Codex |
|
||||
| `"main"` | Use whatever provider the main agent uses | Active main provider configured |
|
||||
| `"anthropic"` | Force Anthropic native | `ANTHROPIC_API_KEY` or Claude Code credentials |
|
||||
|
||||
### Direct Endpoint Override
|
||||
|
||||
For any auxiliary task, setting `base_url` bypasses provider resolution entirely and sends requests directly to that endpoint:
|
||||
|
||||
```yaml
|
||||
auxiliary:
|
||||
  vision:
|
||||
    base_url: "http://localhost:1234/v1"
|
||||
    api_key: "local-key"
|
||||
    model: "qwen2.5-vl"
|
||||
```
|
||||
|
||||
`base_url` takes precedence over `provider`. Hermes uses the configured `api_key` for authentication, falling back to `OPENAI_API_KEY` if not set. It does **not** reuse `OPENROUTER_API_KEY` for custom endpoints.
|
||||
|
||||
---
|
||||
|
||||
## Context Compression Fallback
|
||||
|
||||
Context compression has a legacy configuration path in addition to the auxiliary system:
|
||||
|
||||
```yaml
|
||||
compression:
|
||||
  summary_provider: "auto" # auto | openrouter | nous | main
|
||||
  summary_model: "google/gemini-3-flash-preview"
|
||||
```
|
||||
|
||||
This is equivalent to configuring `auxiliary.compression.provider` and `auxiliary.compression.model`. If both are set, the `auxiliary.compression` values take precedence.
|
||||
|
||||
If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
|
||||
|
||||
---
|
||||
|
||||
## Delegation Provider Override
|
||||
|
||||
Subagents spawned by `delegate_task` do **not** use the primary fallback model. However, they can be routed to a different provider:model pair for cost optimization:
|
||||
|
||||
```yaml
|
||||
delegation:
|
||||
  provider: "openrouter" # override provider for all subagents
|
||||
  model: "google/gemini-3-flash-preview" # override model
|
||||
  # base_url: "http://localhost:1234/v1" # or use a direct endpoint
|
||||
  # api_key: "local-key"
|
||||
```
|
||||
|
||||
See [Subagent Delegation](/docs/user-guide/features/delegation) for full configuration details.
|
||||
|
||||
---
|
||||
|
||||
## Cron Job Providers
|
||||
|
||||
Cron jobs run with whatever provider is configured at execution time. They do not support a fallback model. To use a different provider for cron jobs, configure `provider` and `model` overrides on the cron job itself:
|
||||
|
||||
```python
|
||||
cronjob(
|
||||
action="create",
|
||||
schedule="every 2h",
|
||||
prompt="Check server status",
|
||||
provider="openrouter",
|
||||
model="google/gemini-3-flash-preview"
|
||||
)
|
||||
```
|
||||
|
||||
See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configuration details.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
| Feature | Fallback Mechanism | Config Location |
|
||||
|---------|-------------------|----------------|
|
||||
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
|
||||
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
|
||||
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
|
||||
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` or `compression.summary_provider` |
|
||||
| Session search | Auto-detection chain | `auxiliary.session_search` |
|
||||
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
|
||||
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
|
||||
| Memory flush | Auto-detection chain | `auxiliary.flush_memories` |
|
||||
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
|
||||
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |
|
||||
404
hermes_code/website/docs/user-guide/features/honcho.md
Normal file
|
|
@ -0,0 +1,404 @@
|
|||
---
title: Honcho Memory
description: AI-native persistent memory for cross-session user modeling and personalization.
sidebar_label: Honcho Memory
sidebar_position: 8
---
|
||||
|
||||
# Honcho Memory
|
||||
|
||||
[Honcho](https://honcho.dev) is an AI-native memory system that gives Hermes persistent, cross-session understanding of users. While Hermes has built-in memory (`MEMORY.md` and `USER.md`), Honcho adds a deeper layer of **user modeling** — learning preferences, goals, communication style, and context across conversations via a dual-peer architecture where both the user and the AI build representations over time.
|
||||
|
||||
## Works Alongside Built-in Memory
|
||||
|
||||
Hermes has two memory systems that can work together or be configured separately. In `hybrid` mode (the default), both run side by side — Honcho adds cross-session user modeling while local files handle agent-level notes.
|
||||
|
||||
| Feature | Built-in Memory | Honcho Memory |
|---------|----------------|---------------|
| Storage | Local files (`~/.hermes/memories/`) | Cloud-hosted Honcho API |
| Scope | Agent-level notes and user profile | Deep user modeling via dialectic reasoning |
| Persistence | Across sessions on same machine | Across sessions, machines, and platforms |
| Query | Injected into system prompt automatically | Prefetched + on-demand via tools |
| Content | Manually curated by the agent | Automatically learned from conversations |
| Write surface | `memory` tool (add/replace/remove) | `honcho_conclude` tool (persist facts) |
|
||||
|
||||
Set `memoryMode` to `honcho` to use Honcho exclusively. See [Memory Modes](#memory-modes) for per-peer configuration.
|
||||
|
||||
|
||||
## Self-hosted / Docker
|
||||
|
||||
Hermes supports a local Honcho instance (e.g. via Docker) in addition to the hosted API. Point it at your instance using `HONCHO_BASE_URL` — no API key required.
|
||||
|
||||
**Via `hermes config`:**
|
||||
|
||||
```bash
|
||||
hermes config set HONCHO_BASE_URL http://localhost:8000
|
||||
```
|
||||
|
||||
**Via `~/.honcho/config.json`:**
|
||||
|
||||
```json
{
  "hosts": {
    "hermes": {
      "base_url": "http://localhost:8000",
      "enabled": true
    }
  }
}
```
|
||||
|
||||
Hermes auto-enables Honcho when either `apiKey` or `base_url` is present, so no further configuration is needed for a local instance.
|
||||
|
||||
To run Honcho locally, refer to the [Honcho self-hosting docs](https://docs.honcho.dev).
|
||||
|
||||
## Setup
|
||||
|
||||
### Interactive Setup
|
||||
|
||||
```bash
|
||||
hermes honcho setup
|
||||
```
|
||||
|
||||
The setup wizard walks through API key, peer names, workspace, memory mode, write frequency, recall mode, and session strategy. It offers to install `honcho-ai` if missing.
|
||||
|
||||
### Manual Setup
|
||||
|
||||
#### 1. Install the Client Library
|
||||
|
||||
```bash
|
||||
pip install 'honcho-ai>=2.0.1'
|
||||
```
|
||||
|
||||
#### 2. Get an API Key
|
||||
|
||||
Go to [app.honcho.dev](https://app.honcho.dev) > Settings > API Keys.
|
||||
|
||||
#### 3. Configure
|
||||
|
||||
Honcho reads from `~/.honcho/config.json` (shared across all Honcho-enabled applications):
|
||||
|
||||
```json
{
  "apiKey": "your-honcho-api-key",
  "hosts": {
    "hermes": {
      "workspace": "hermes",
      "peerName": "your-name",
      "aiPeer": "hermes",
      "memoryMode": "hybrid",
      "writeFrequency": "async",
      "recallMode": "hybrid",
      "sessionStrategy": "per-session",
      "enabled": true
    }
  }
}
```
|
||||
|
||||
`apiKey` lives at the root because it is a shared credential across all Honcho-enabled tools. All other settings are scoped under `hosts.hermes`. The `hermes honcho setup` wizard writes this structure automatically.
|
||||
|
||||
Or set the API key as an environment variable:
|
||||
|
||||
```bash
|
||||
hermes config set HONCHO_API_KEY your-key
|
||||
```
|
||||
|
||||
:::info
|
||||
When an API key is present (either in `~/.honcho/config.json` or as `HONCHO_API_KEY`), Honcho auto-enables unless explicitly set to `"enabled": false`.
|
||||
:::
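
The auto-enable rule described above can be sketched as a small predicate. This is a minimal illustration under stated assumptions, not Hermes' actual implementation: the function name `honcho_enabled` is hypothetical, the config dict is assumed to be already loaded from `~/.honcho/config.json`, and the fallback of `enabled` to the root level is inferred from the host/root fallback rule.

```python
import os


def honcho_enabled(config: dict, host: str = "hermes", env=None) -> bool:
    """Hypothetical sketch of the auto-enable rule (not the real Hermes code)."""
    env = os.environ if env is None else env
    host_block = config.get("hosts", {}).get(host, {})

    # An explicit "enabled" flag always wins: host block first, then root.
    for scope in (host_block, config):
        if "enabled" in scope:
            return bool(scope["enabled"])

    # Otherwise enable when a credential or a self-hosted URL is present.
    has_key = bool(config.get("apiKey") or env.get("HONCHO_API_KEY"))
    has_url = bool(host_block.get("base_url") or env.get("HONCHO_BASE_URL"))
    return has_key or has_url


print(honcho_enabled({"apiKey": "your-key"}, env={}))
print(honcho_enabled({"apiKey": "your-key", "hosts": {"hermes": {"enabled": False}}}, env={}))
```

Passing `env` explicitly keeps the check easy to test; in normal use it would read the process environment.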
|
||||
|
||||
## Configuration
|
||||
|
||||
### Global Config (`~/.honcho/config.json`)
|
||||
|
||||
Settings are scoped to `hosts.hermes` and fall back to root-level globals when the host field is absent. Root-level keys are managed by the user or the honcho CLI -- Hermes only writes to its own host block (except `apiKey`, which is a shared credential at root).
|
||||
|
||||
**Root-level (shared)**
|
||||
|
||||
| Field | Default | Description |
|-------|---------|-------------|
| `apiKey` | *(none)* | Honcho API key (required for the hosted API, shared across all hosts) |
| `sessions` | `{}` | Manual session name overrides per directory (shared) |
|
||||
|
||||
**Host-level (`hosts.hermes`)**
|
||||
|
||||
| Field | Default | Description |
|-------|---------|-------------|
| `workspace` | `"hermes"` | Workspace identifier |
| `peerName` | *(derived)* | Your identity name for user modeling |
| `aiPeer` | `"hermes"` | AI assistant identity name |
| `environment` | `"production"` | Honcho environment |
| `enabled` | *(auto)* | Auto-enables when an API key or `base_url` is present |
| `saveMessages` | `true` | Whether to sync messages to Honcho |
| `memoryMode` | `"hybrid"` | Memory mode: `hybrid` or `honcho` |
| `writeFrequency` | `"async"` | When to write: `async`, `turn`, `session`, or integer N |
| `recallMode` | `"hybrid"` | Retrieval strategy: `hybrid`, `context`, or `tools` |
| `sessionStrategy` | `"per-session"` | How sessions are scoped |
| `sessionPeerPrefix` | `false` | Prefix session names with peer name |
| `contextTokens` | *(Honcho default)* | Max tokens for auto-injected context |
| `dialecticReasoningLevel` | `"low"` | Floor for dialectic reasoning: `minimal` / `low` / `medium` / `high` / `max` |
| `dialecticMaxChars` | `600` | Char cap on dialectic results injected into system prompt |
| `linkedHosts` | `[]` | Other host keys whose workspaces to cross-reference |
|
||||
|
||||
All host-level fields fall back to the equivalent root-level key if not set under `hosts.hermes`. Existing configs with settings at root level continue to work.
|
||||
|
||||
### Memory Modes
|
||||
|
||||
| Mode | Effect |
|------|--------|
| `hybrid` | Write to both Honcho and local files (default) |
| `honcho` | Honcho only (skip local file writes) |
|
||||
|
||||
Memory mode can be set globally or per peer (user, agent1, agent2, etc.):
|
||||
|
||||
```json
{
  "memoryMode": {
    "default": "hybrid",
    "hermes": "honcho"
  }
}
```
|
||||
|
||||
To disable Honcho entirely, set `enabled: false` or remove the API key.
|
||||
|
||||
### Recall Modes
|
||||
|
||||
Controls how Honcho context reaches the agent:
|
||||
|
||||
| Mode | Behavior |
|------|----------|
| `hybrid` | Auto-injected context + Honcho tools available (default) |
| `context` | Auto-injected context only; Honcho tools hidden |
| `tools` | Honcho tools only; no auto-injected context |
|
||||
|
||||
### Write Frequency
|
||||
|
||||
| Setting | Behavior |
|---------|----------|
| `async` | Background thread writes (zero blocking, default) |
| `turn` | Synchronous write after each turn |
| `session` | Batched write at session end |
| *integer N* | Write every N turns |
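
As a rough illustration of how these settings could drive a write decision, the sketch below maps a `writeFrequency` value to "flush synchronously now or not". The helper name `should_write_now` and its parameters are hypothetical, and `async` is modelled as "never block here" on the assumption that a background thread drains the queue on its own.

```python
def should_write_now(write_frequency, turn_number: int, session_ending: bool = False) -> bool:
    """Hypothetical helper: is a synchronous write due after this turn?"""
    if write_frequency == "async":
        return False              # assumed: a background writer handles it
    if write_frequency == "turn":
        return True               # write after every turn
    if write_frequency == "session":
        return session_ending     # one batched write at session end
    is_count = isinstance(write_frequency, int) and not isinstance(write_frequency, bool)
    if is_count and write_frequency > 0:
        return turn_number % write_frequency == 0   # every N turns
    raise ValueError(f"unknown writeFrequency: {write_frequency!r}")


print(should_write_now("turn", 1))
print(should_write_now(3, 6))
```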
|
||||
|
||||
### Session Strategies
|
||||
|
||||
| Strategy | Session key | Use case |
|----------|-------------|----------|
| `per-session` | Unique per run | Default. Fresh session every time. |
| `per-directory` | CWD basename | Each project gets its own session. |
| `per-repo` | Git repo root name | Groups subdirectories under one session. |
| `global` | Fixed `"global"` | Single cross-project session. |
|
||||
|
||||
Resolution order: manual map > session title > strategy-derived key > platform key.
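
The resolution order can be pictured with a short sketch. Everything here is illustrative: the function names, the use of a random hex string for `per-session`, and the fallback to the directory name when no `.git` directory is found are assumptions, not documented Hermes behavior.

```python
import uuid
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upward until a directory containing .git is found."""
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def strategy_key(strategy: str, cwd: Path) -> Optional[str]:
    """Derive a session key from the configured strategy."""
    if strategy == "per-session":
        return uuid.uuid4().hex              # unique per run (assumed format)
    if strategy == "per-directory":
        return cwd.name                      # CWD basename
    if strategy == "per-repo":
        root = find_repo_root(cwd)
        return root.name if root else cwd.name   # assumed fallback
    if strategy == "global":
        return "global"
    return None


def resolve_session_key(cwd: Path, strategy: str, manual_map=None,
                        title=None, platform_key=None) -> Optional[str]:
    """manual map > session title > strategy-derived key > platform key."""
    manual_map = manual_map or {}
    return (
        manual_map.get(str(cwd))
        or title
        or strategy_key(strategy, cwd)
        or platform_key
    )


cwd = Path("/tmp/projects/demo")
print(resolve_session_key(cwd, "per-directory"))
print(resolve_session_key(cwd, "per-directory", manual_map={str(cwd): "pinned"}))
```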
|
||||
|
||||
### Multi-host Configuration
|
||||
|
||||
Multiple Honcho-enabled tools share `~/.honcho/config.json`. Each tool writes only to its own host block, reads its host block first, and falls back to root-level globals:
|
||||
|
||||
```json
{
  "apiKey": "your-key",
  "peerName": "eri",
  "hosts": {
    "hermes": {
      "workspace": "my-workspace",
      "aiPeer": "hermes-assistant",
      "memoryMode": "honcho",
      "linkedHosts": ["claude-code"],
      "contextTokens": 2000,
      "dialecticReasoningLevel": "medium"
    },
    "claude-code": {
      "workspace": "my-workspace",
      "aiPeer": "clawd"
    }
  }
}
```
|
||||
|
||||
Resolution: `hosts.<tool>` field > root-level field > default. In this example, both tools share the root `apiKey` and `peerName`, but each has its own `aiPeer` and workspace settings.
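
A minimal sketch of that lookup order, using a trimmed copy of the example config. The helper `resolve_setting` and the `DEFAULTS` table are hypothetical names for illustration; the default values are taken from the host-level field table.

```python
DEFAULTS = {
    "workspace": "hermes",
    "aiPeer": "hermes",
    "memoryMode": "hybrid",
    "recallMode": "hybrid",
}


def resolve_setting(config: dict, host: str, field: str):
    """Return hosts.<host>.<field>, else the root-level field, else a default."""
    host_block = config.get("hosts", {}).get(host, {})
    if field in host_block:
        return host_block[field]
    if field in config:
        return config[field]
    return DEFAULTS.get(field)


config = {
    "apiKey": "your-key",
    "peerName": "eri",
    "hosts": {
        "hermes": {"workspace": "my-workspace", "aiPeer": "hermes-assistant"},
        "claude-code": {"workspace": "my-workspace", "aiPeer": "clawd"},
    },
}

print(resolve_setting(config, "hermes", "aiPeer"))            # hermes-assistant
print(resolve_setting(config, "claude-code", "peerName"))     # eri
print(resolve_setting(config, "claude-code", "memoryMode"))   # hybrid
```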
|
||||
|
||||
### Hermes Config (`~/.hermes/config.yaml`)
|
||||
|
||||
Intentionally minimal — most configuration comes from `~/.honcho/config.json`:
|
||||
|
||||
```yaml
|
||||
honcho: {}
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
### Async Context Pipeline
|
||||
|
||||
Honcho context is fetched asynchronously to avoid blocking the response path:
|
||||
|
||||
```mermaid
flowchart TD
    user["User message"] --> cache["Consume cached Honcho context<br/>from the previous turn"]
    cache --> prompt["Inject user, AI, and dialectic context<br/>into the system prompt"]
    prompt --> llm["LLM call"]
    llm --> response["Assistant response"]
    response --> fetch["Start background fetch for Turn N+1"]
    fetch --> ctx["Fetch context"]
    fetch --> dia["Fetch dialectic"]
    ctx --> next["Cache for the next turn"]
    dia --> next
```
|
||||
|
||||
Turn 1 is a cold start (no cache). All subsequent turns consume cached results with zero HTTP latency on the response path. The system prompt on turn 1 uses only static context to preserve prefix cache hits at the LLM provider.
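
The consume-then-prefetch pattern can be sketched with `asyncio`. This is an illustration of the general technique only: `PrefetchCache` and `fake_fetch` are hypothetical stand-ins, and the real pipeline fetches context and dialectic results from Honcho rather than building a string.

```python
import asyncio


class PrefetchCache:
    """Hypothetical sketch: serve the last prefetched context, refresh in background."""

    def __init__(self, fetch):
        self._fetch = fetch      # async callable returning context text
        self._cached = None      # nothing cached on turn 1 (cold start)
        self._task = None

    def consume(self):
        """Return whatever the previous turn prefetched; never touches the network."""
        return self._cached

    def start_prefetch(self, message: str) -> None:
        """Kick off a background fetch whose result is used on the next turn."""
        self._task = asyncio.create_task(self._refresh(message))

    async def _refresh(self, message: str) -> None:
        try:
            self._cached = await self._fetch(message)
        except Exception:
            pass  # non-fatal: keep the previous cache if the fetch fails

    async def wait(self) -> None:
        if self._task is not None:
            await self._task


async def demo():
    async def fake_fetch(message):   # stand-in for a Honcho API call
        await asyncio.sleep(0)
        return f"context for: {message}"

    cache = PrefetchCache(fake_fetch)
    seen = []
    for message in ["hello", "what did I say?"]:
        seen.append(cache.consume())   # turn 1 sees None
        # ... the LLM call would happen here ...
        cache.start_prefetch(message)
        await cache.wait()             # only so the demo is deterministic
    return seen


print(asyncio.run(demo()))   # [None, 'context for: hello']
```

The response path only ever reads `consume()`, which is why the network round trip stays off it.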
|
||||
|
||||
### Dual-Peer Architecture
|
||||
|
||||
Both the user and AI have peer representations in Honcho:
|
||||
|
||||
- **User peer** — observed from user messages. Honcho learns preferences, goals, communication style.
|
||||
- **AI peer** — observed from assistant messages (`observe_me=True`). Honcho builds a representation of the agent's knowledge and behavior.
|
||||
|
||||
Both representations are injected into the system prompt when available.
|
||||
|
||||
### Dynamic Reasoning Level
|
||||
|
||||
Dialectic queries scale reasoning effort with message complexity:
|
||||
|
||||
| Message length | Reasoning level |
|----------------|-----------------|
| < 120 chars | Config default (typically `low`) |
| 120-400 chars | One level above default (cap: `high`) |
| > 400 chars | Two levels above default (cap: `high`) |
|
||||
|
||||
`max` is never selected automatically.
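
The table translates into a small function. This is a sketch under assumptions: the name `reasoning_level` is hypothetical, and the handling of a configured default above `high` (it is kept as-is rather than lowered) is a guess at the intended behavior.

```python
LEVELS = ["minimal", "low", "medium", "high", "max"]
AUTO_CAP = LEVELS.index("high")   # automatic bumps never go past "high"


def reasoning_level(message: str, default: str = "low") -> str:
    """Pick a dialectic reasoning level from message length."""
    base = LEVELS.index(default)
    length = len(message)
    if length < 120:
        return default            # short message: config default
    bump = 1 if length <= 400 else 2
    # Cap the bumped level at "high", but never drop below the configured floor.
    return LEVELS[max(base, min(base + bump, AUTO_CAP))]


print(reasoning_level("short question"))   # low
print(reasoning_level("x" * 200))          # medium
print(reasoning_level("x" * 500))          # high
```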
|
||||
|
||||
### Gateway Integration
|
||||
|
||||
The gateway creates short-lived `AIAgent` instances per request. Honcho managers are owned at the gateway session layer (`_honcho_managers` dict) so they persist across requests within the same session and flush at real session boundaries (reset, resume, expiry, server stop).
|
||||
|
||||
#### Session Isolation
|
||||
|
||||
Each gateway session (e.g., a Telegram chat, a Discord channel) gets its own Honcho session context. The session key — derived from the platform and chat ID — is threaded through the entire tool dispatch chain so that Honcho tool calls always execute against the correct session, even when multiple users are messaging concurrently.
|
||||
|
||||
This means:
|
||||
- **`honcho_profile`**, **`honcho_search`**, **`honcho_context`**, and **`honcho_conclude`** all resolve the correct session at call time, not at startup
|
||||
- Background memory flushes (triggered by `/reset`, `/resume`, or session expiry) preserve the original session key so they write to the correct Honcho session
|
||||
- Synthetic flush turns (where the agent saves memories before context is lost) skip Honcho sync to avoid polluting conversation history with internal bookkeeping
|
||||
|
||||
#### Session Lifecycle
|
||||
|
||||
| Event | What happens to Honcho |
|-------|------------------------|
| New message arrives | Agent inherits the gateway's Honcho manager + session key |
| `/reset` | Memory flush fires with the old session key, then Honcho manager shuts down |
| `/resume` | Current session is flushed, then the resumed session's Honcho context loads |
| Session expiry | Automatic flush + shutdown after the configured idle timeout |
| Gateway stop | All active Honcho managers are flushed and shut down gracefully |
|
||||
|
||||
## Tools
|
||||
|
||||
When Honcho is active, four tools become available. Availability is gated dynamically — they are invisible when Honcho is disabled.
|
||||
|
||||
### `honcho_profile`
|
||||
|
||||
Fast peer card retrieval (no LLM). Returns a curated list of key facts about the user.
|
||||
|
||||
### `honcho_search`
|
||||
|
||||
Semantic search over memory (no LLM). Returns raw excerpts ranked by relevance. Cheaper and faster than `honcho_context` — good for factual lookups.
|
||||
|
||||
Parameters:
|
||||
- `query` (string) — search query
|
||||
- `max_tokens` (integer, optional) — result token budget
|
||||
|
||||
### `honcho_context`
|
||||
|
||||
Dialectic Q&A powered by Honcho's LLM. Synthesizes an answer from accumulated conversation history.
|
||||
|
||||
Parameters:
|
||||
- `query` (string) — natural language question
|
||||
- `peer` (string, optional) — `"user"` (default) or `"ai"`. Querying `"ai"` asks about the assistant's own history and identity.
|
||||
|
||||
Example queries the agent might make:
|
||||
|
||||
```
"What are this user's main goals?"
"What communication style does this user prefer?"
"What topics has this user discussed recently?"
"What is this user's technical expertise level?"
```
|
||||
|
||||
### `honcho_conclude`
|
||||
|
||||
Writes a fact to Honcho memory. Use when the user explicitly states a preference, correction, or project context worth remembering. Feeds into the user's peer card and representation.
|
||||
|
||||
Parameters:
|
||||
- `conclusion` (string) — the fact to persist
|
||||
|
||||
## CLI Commands
|
||||
|
||||
```
hermes honcho setup                        # Interactive setup wizard
hermes honcho status                       # Show config and connection status
hermes honcho sessions                     # List directory → session name mappings
hermes honcho map <name>                   # Map current directory to a session name
hermes honcho peer                         # Show peer names and dialectic settings
hermes honcho peer --user NAME             # Set user peer name
hermes honcho peer --ai NAME               # Set AI peer name
hermes honcho peer --reasoning LEVEL       # Set dialectic reasoning level
hermes honcho mode                         # Show current memory mode
hermes honcho mode [hybrid|honcho|local]   # Set memory mode
hermes honcho tokens                       # Show token budget settings
hermes honcho tokens --context N           # Set context token cap
hermes honcho tokens --dialectic N         # Set dialectic char cap
hermes honcho identity                     # Show AI peer identity
hermes honcho identity <file>              # Seed AI peer identity from file (SOUL.md, etc.)
hermes honcho migrate                      # Migration guide: OpenClaw → Hermes + Honcho
```
|
||||
|
||||
### Doctor Integration
|
||||
|
||||
`hermes doctor` includes a Honcho section that validates config, API key, and connection status.
|
||||
|
||||
## Migration
|
||||
|
||||
### From Local Memory
|
||||
|
||||
When Honcho activates on an instance with existing local history, migration runs automatically:
|
||||
|
||||
1. **Conversation history** — prior messages are uploaded as an XML transcript file
|
||||
2. **Memory files** — existing `MEMORY.md`, `USER.md`, and `SOUL.md` are uploaded for context
|
||||
|
||||
### From OpenClaw
|
||||
|
||||
```bash
|
||||
hermes honcho migrate
|
||||
```
|
||||
|
||||
Walks through converting an OpenClaw native Honcho setup to the shared `~/.honcho/config.json` format.
|
||||
|
||||
## AI Peer Identity
|
||||
|
||||
Honcho can build a representation of the AI assistant over time (via `observe_me=True`). You can also seed the AI peer explicitly:
|
||||
|
||||
```bash
|
||||
hermes honcho identity ~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
This uploads the file content through Honcho's observation pipeline. The AI peer representation is then injected into the system prompt alongside the user's, giving the agent awareness of its own accumulated identity.
|
||||
|
||||
```bash
|
||||
hermes honcho identity --show
|
||||
```
|
||||
|
||||
Shows the current AI peer representation from Honcho.
|
||||
|
||||
## Use Cases
|
||||
|
||||
- **Personalized responses** — Honcho learns how each user prefers to communicate
|
||||
- **Goal tracking** — remembers what users are working toward across sessions
|
||||
- **Expertise adaptation** — adjusts technical depth based on user's background
|
||||
- **Cross-platform memory** — same user understanding across CLI, Telegram, Discord, etc.
|
||||
- **Multi-user support** — each user (via messaging platforms) gets their own user model
|
||||
|
||||
:::tip
|
||||
Honcho is fully opt-in — zero behavior change when disabled or unconfigured. All Honcho calls are non-fatal; if the service is unreachable, the agent continues normally.
|
||||
:::
|
||||
182
hermes_code/website/docs/user-guide/features/hooks.md
Normal file
|
|
@ -0,0 +1,182 @@
|
|||
---
sidebar_position: 6
title: "Event Hooks"
description: "Run custom code at key lifecycle points: log activity, send alerts, post to webhooks"
---
|
||||
|
||||
# Event Hooks
|
||||
|
||||
The hooks system lets you run custom code at key points in the agent lifecycle — session creation, slash commands, each tool-calling step, and more. Hooks fire automatically during gateway operation without blocking the main agent pipeline.
|
||||
|
||||
## Creating a Hook
|
||||
|
||||
Each hook is a directory under `~/.hermes/hooks/` containing two files:
|
||||
|
||||
```text
~/.hermes/hooks/
└── my-hook/
    ├── HOOK.yaml      # Declares which events to listen for
    └── handler.py     # Python handler function
```
|
||||
|
||||
### HOOK.yaml
|
||||
|
||||
```yaml
name: my-hook
description: Log all agent activity to a file
events:
  - agent:start
  - agent:end
  - agent:step
```
|
||||
|
||||
The `events` list determines which events trigger your handler. You can subscribe to any combination of events, including wildcards like `command:*`.
|
||||
|
||||
### handler.py
|
||||
|
||||
```python
import json
from datetime import datetime
from pathlib import Path

LOG_FILE = Path.home() / ".hermes" / "hooks" / "my-hook" / "activity.log"

async def handle(event_type: str, context: dict):
    """Called for each subscribed event. Must be named 'handle'."""
    entry = {
        "timestamp": datetime.now().isoformat(),
        "event": event_type,
        **context,
    }
    # Append one JSON object per line.
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(entry) + "\n")
```
|
||||
|
||||
**Handler rules:**
|
||||
- Must be named `handle`
|
||||
- Receives `event_type` (string) and `context` (dict)
|
||||
- Can be `async def` or regular `def` — both work
|
||||
- Errors are caught and logged, never crashing the agent
|
||||
|
||||
## Available Events
|
||||
|
||||
| Event | When it fires | Context keys |
|-------|---------------|--------------|
| `gateway:startup` | Gateway process starts | `platforms` (list of active platform names) |
| `session:start` | New messaging session created | `platform`, `user_id`, `session_id`, `session_key` |
| `session:reset` | User ran `/new` or `/reset` | `platform`, `user_id`, `session_key` |
| `agent:start` | Agent begins processing a message | `platform`, `user_id`, `session_id`, `message` |
| `agent:step` | Each iteration of the tool-calling loop | `platform`, `user_id`, `session_id`, `iteration`, `tool_names` |
| `agent:end` | Agent finishes processing | `platform`, `user_id`, `session_id`, `message`, `response` |
| `command:*` | Any slash command executed | `platform`, `user_id`, `command`, `args` |
|
||||
|
||||
### Wildcard Matching
|
||||
|
||||
Handlers registered for `command:*` fire for any `command:` event (`command:model`, `command:reset`, etc.). Monitor all slash commands with a single subscription.
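
One plausible way to implement this kind of matching is a prefix check on the part before `*`. The function below is a hypothetical sketch, not the matcher Hermes actually uses.

```python
def event_matches(pattern: str, event: str) -> bool:
    """Return True if a subscription pattern covers an emitted event name."""
    if pattern.endswith(":*"):
        # "command:*" covers every event in the "command:" namespace.
        return event.startswith(pattern[:-1])
    return pattern == event


print(event_matches("command:*", "command:model"))   # True
print(event_matches("command:*", "agent:start"))     # False
print(event_matches("agent:step", "agent:step"))     # True
```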
|
||||
|
||||
## Examples
|
||||
|
||||
### Telegram Alert on Long Tasks
|
||||
|
||||
Send yourself a message when the agent reaches 10 steps:
|
||||
|
||||
```yaml
# ~/.hermes/hooks/long-task-alert/HOOK.yaml
name: long-task-alert
description: Alert when agent is taking many steps
events:
  - agent:step
```
|
||||
|
||||
```python
# ~/.hermes/hooks/long-task-alert/handler.py
import os
import httpx

THRESHOLD = 10
BOT_TOKEN = os.getenv("TELEGRAM_BOT_TOKEN")
CHAT_ID = os.getenv("TELEGRAM_HOME_CHANNEL")

async def handle(event_type: str, context: dict):
    iteration = context.get("iteration", 0)
    # Fire exactly once, on the step that hits the threshold.
    if iteration == THRESHOLD and BOT_TOKEN and CHAT_ID:
        tools = ", ".join(context.get("tool_names", []))
        text = f"⚠️ Agent has been running for {iteration} steps. Last tools: {tools}"
        async with httpx.AsyncClient() as client:
            await client.post(
                f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
                json={"chat_id": CHAT_ID, "text": text},
            )
```
|
||||
|
||||
### Command Usage Logger
|
||||
|
||||
Track which slash commands are used:
|
||||
|
||||
```yaml
# ~/.hermes/hooks/command-logger/HOOK.yaml
name: command-logger
description: Log slash command usage
events:
  - command:*
```
|
||||
|
||||
```python
# ~/.hermes/hooks/command-logger/handler.py
import json
from datetime import datetime
from pathlib import Path

LOG = Path.home() / ".hermes" / "logs" / "command_usage.jsonl"

def handle(event_type: str, context: dict):
    # Create the log directory on first use.
    LOG.parent.mkdir(parents=True, exist_ok=True)
    entry = {
        "ts": datetime.now().isoformat(),
        "command": context.get("command"),
        "args": context.get("args"),
        "platform": context.get("platform"),
        "user": context.get("user_id"),
    }
    with open(LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")
```
|
||||
|
||||
### Session Start Webhook
|
||||
|
||||
POST to an external service on new sessions:
|
||||
|
||||
```yaml
# ~/.hermes/hooks/session-webhook/HOOK.yaml
name: session-webhook
description: Notify external service on new sessions
events:
  - session:start
  - session:reset
```
|
||||
|
||||
```python
# ~/.hermes/hooks/session-webhook/handler.py
import httpx

WEBHOOK_URL = "https://your-service.example.com/hermes-events"

async def handle(event_type: str, context: dict):
    async with httpx.AsyncClient() as client:
        await client.post(WEBHOOK_URL, json={
            "event": event_type,
            **context,
        }, timeout=5)
```
|
||||
|
||||
## How It Works
|
||||
|
||||
1. On gateway startup, `HookRegistry.discover_and_load()` scans `~/.hermes/hooks/`
|
||||
2. Each subdirectory with `HOOK.yaml` + `handler.py` is loaded dynamically
|
||||
3. Handlers are registered for their declared events
|
||||
4. At each lifecycle point, `hooks.emit()` fires all matching handlers
|
||||
5. Errors in any handler are caught and logged — a broken hook never crashes the agent
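
To make steps 3 to 5 concrete, here is a toy registry that dispatches to sync and async handlers and swallows handler errors. `MiniHookRegistry` is a hypothetical illustration and does not reproduce the internals of Hermes' `HookRegistry`; discovery and dynamic loading of `handler.py` files are left out.

```python
import asyncio
import inspect
import logging
from collections import defaultdict

logger = logging.getLogger("hooks-sketch")


class MiniHookRegistry:
    """Toy event registry: register handlers, emit events, isolate failures."""

    def __init__(self):
        self._handlers = defaultdict(list)   # event pattern -> handlers

    def register(self, pattern, handler):
        self._handlers[pattern].append(handler)

    def _matching(self, event_type):
        for pattern, handlers in self._handlers.items():
            prefix_match = pattern.endswith(":*") and event_type.startswith(pattern[:-1])
            if pattern == event_type or prefix_match:
                yield from handlers

    async def emit(self, event_type, context=None):
        for handler in self._matching(event_type):
            try:
                result = handler(event_type, context or {})
                if inspect.isawaitable(result):   # supports async def handlers
                    await result
            except Exception:
                # A broken hook is logged and skipped, never re-raised.
                logger.exception("hook handler failed for %s", event_type)


async def demo():
    calls = []
    registry = MiniHookRegistry()
    registry.register("command:*", lambda event, ctx: calls.append(event))
    registry.register("agent:start", lambda event, ctx: 1 / 0)   # broken hook

    async def on_start(event_type, context):
        calls.append(event_type)

    registry.register("agent:start", on_start)
    await registry.emit("command:model", {"command": "model"})
    await registry.emit("agent:start")
    return calls


print(asyncio.run(demo()))   # ['command:model', 'agent:start']
```

The failing handler is logged to stderr while the remaining handlers still run.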
|
||||
|
||||
:::info
|
||||
Hooks only fire in the **gateway** (Telegram, Discord, Slack, WhatsApp). The CLI does not currently load hooks.
|
||||
:::
|
||||
150
hermes_code/website/docs/user-guide/features/image-generation.md
Normal file
|
|
@ -0,0 +1,150 @@
|
|||
---
title: Image Generation
description: Generate high-quality images using FLUX 2 Pro with automatic upscaling via FAL.ai.
sidebar_label: Image Generation
sidebar_position: 6
---
|
||||
|
||||
# Image Generation
|
||||
|
||||
Hermes Agent can generate images from text prompts using FAL.ai's **FLUX 2 Pro** model with automatic 2x upscaling via the **Clarity Upscaler** for enhanced quality.
|
||||
|
||||
## Setup
|
||||
|
||||
### Get a FAL API Key
|
||||
|
||||
1. Sign up at [fal.ai](https://fal.ai/)
|
||||
2. Generate an API key from your dashboard
|
||||
|
||||
### Configure the Key
|
||||
|
||||
```bash
|
||||
# Add to ~/.hermes/.env
|
||||
FAL_KEY=your-fal-api-key-here
|
||||
```
|
||||
|
||||
### Install the Client Library
|
||||
|
||||
```bash
|
||||
pip install fal-client
|
||||
```
|
||||
|
||||
:::info
|
||||
The image generation tool is automatically available when `FAL_KEY` is set. No additional toolset configuration is needed.
|
||||
:::
|
||||
|
||||
## How It Works
|
||||
|
||||
When you ask Hermes to generate an image:
|
||||
|
||||
1. **Generation** — Your prompt is sent to the FLUX 2 Pro model (`fal-ai/flux-2-pro`)
|
||||
2. **Upscaling** — The generated image is automatically upscaled 2x using the Clarity Upscaler (`fal-ai/clarity-upscaler`)
|
||||
3. **Delivery** — The upscaled image URL is returned
|
||||
|
||||
If upscaling fails for any reason, the original image is returned as a fallback.
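
The generate, upscale, fall back flow can be sketched independently of the FAL client. In this illustration `generate` and `upscale` are placeholder callables standing in for requests to `fal-ai/flux-2-pro` and `fal-ai/clarity-upscaler`; their signatures are assumptions, not the tool's real interface.

```python
def generate_with_upscale(prompt, generate, upscale):
    """Return the upscaled image URL, or the original URL if upscaling fails."""
    original_url = generate(prompt)
    try:
        return upscale(original_url, prompt)
    except Exception:
        # Upscaling is best-effort: fall back to the original image.
        return original_url


def fake_generate(prompt):
    return "https://example.com/original.png"


def fake_upscale(url, prompt):
    return url.replace("original", "upscaled")


def failing_upscale(url, prompt):
    raise RuntimeError("rate limited")


print(generate_with_upscale("a mountain", fake_generate, fake_upscale))
print(generate_with_upscale("a mountain", fake_generate, failing_upscale))
```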
|
||||
|
||||
## Usage
|
||||
|
||||
Simply ask Hermes to create an image:
|
||||
|
||||
```
|
||||
Generate an image of a serene mountain landscape with cherry blossoms
|
||||
```
|
||||
|
||||
```
|
||||
Create a portrait of a wise old owl perched on an ancient tree branch
|
||||
```
|
||||
|
||||
```
|
||||
Make me a futuristic cityscape with flying cars and neon lights
|
||||
```
|
||||
|
||||
## Parameters
|
||||
|
||||
The `image_generate_tool` accepts these parameters:
|
||||
|
||||
| Parameter | Default | Range | Description |
|-----------|---------|-------|-------------|
| `prompt` | *(required)* | n/a | Text description of the desired image |
| `aspect_ratio` | `"landscape"` | `landscape`, `square`, `portrait` | Image aspect ratio |
| `num_inference_steps` | `50` | 1–100 | Number of denoising steps (more = higher quality, slower) |
| `guidance_scale` | `4.5` | 0.1–20.0 | How closely to follow the prompt |
| `num_images` | `1` | 1–4 | Number of images to generate |
| `output_format` | `"png"` | `png`, `jpeg` | Image file format |
| `seed` | *(random)* | any integer | Random seed for reproducible results |
|
||||
|
||||
## Aspect Ratios
|
||||
|
||||
The tool uses simplified aspect ratio names that map to FLUX 2 Pro image sizes:
|
||||
|
||||
| Aspect Ratio | Maps To | Best For |
|-------------|---------|----------|
| `landscape` | `landscape_16_9` | Wallpapers, banners, scenes |
| `square` | `square_hd` | Profile pictures, social media posts |
| `portrait` | `portrait_16_9` | Character art, phone wallpapers |
|
||||
|
||||
:::tip
|
||||
You can also use the raw FLUX 2 Pro size presets directly: `square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`. Custom sizes up to 2048x2048 are also supported.
|
||||
:::
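
The name mapping amounts to a dictionary lookup with a pass-through for raw presets. The sketch below is illustrative only: `resolve_image_size` is a hypothetical name, custom sizes are not handled, and giving the simplified name `square` priority over the raw `square` preset is an assumption.

```python
ASPECT_RATIO_MAP = {
    "landscape": "landscape_16_9",
    "square": "square_hd",
    "portrait": "portrait_16_9",
}

RAW_PRESETS = {
    "square_hd", "square", "portrait_4_3", "portrait_16_9",
    "landscape_4_3", "landscape_16_9",
}


def resolve_image_size(aspect_ratio: str = "landscape") -> str:
    """Map a simplified aspect ratio name to a FLUX 2 Pro size preset."""
    if aspect_ratio in ASPECT_RATIO_MAP:
        return ASPECT_RATIO_MAP[aspect_ratio]   # simplified names win (assumed)
    if aspect_ratio in RAW_PRESETS:
        return aspect_ratio                     # raw presets pass through
    raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")


print(resolve_image_size("portrait"))        # portrait_16_9
print(resolve_image_size("landscape_4_3"))   # landscape_4_3
```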
|
||||
|
||||
## Automatic Upscaling
|
||||
|
||||
Every generated image is automatically upscaled 2x using FAL.ai's Clarity Upscaler with these settings:
|
||||
|
||||
| Setting | Value |
|---------|-------|
| Upscale Factor | 2x |
| Creativity | 0.35 |
| Resemblance | 0.6 |
| Guidance Scale | 4 |
| Inference Steps | 18 |
| Positive Prompt | `"masterpiece, best quality, highres"` + your original prompt |
| Negative Prompt | `"(worst quality, low quality, normal quality:2)"` |
|
||||
|
||||
The upscaler enhances detail and resolution while preserving the original composition. If the upscaler fails (network issue, rate limit), the original resolution image is returned automatically.
|
||||
|
||||
## Example Prompts
|
||||
|
||||
Here are some effective prompts to try:
|
||||
|
||||
```
|
||||
A candid street photo of a woman with a pink bob and bold eyeliner
|
||||
```
|
||||
|
||||
```
|
||||
Modern architecture building with glass facade, sunset lighting
|
||||
```
|
||||
|
||||
```
|
||||
Abstract art with vibrant colors and geometric patterns
|
||||
```
|
||||
|
||||
```
|
||||
Portrait of a wise old owl perched on ancient tree branch
|
||||
```
|
||||
|
||||
```
|
||||
Futuristic cityscape with flying cars and neon lights
|
||||
```
|
||||
|
||||
## Debugging
|
||||
|
||||
Enable debug logging for image generation:
|
||||
|
||||
```bash
|
||||
export IMAGE_TOOLS_DEBUG=true
|
||||
```
|
||||
|
||||
Debug logs are saved to `./logs/image_tools_debug_<session_id>.json` with details about each generation request, parameters, timing, and any errors.
|
||||
|
||||
## Safety Settings
|
||||
|
||||
The image generation tool runs with safety checks effectively disabled (`safety_tolerance: 5`, the most permissive setting). This value is set in code and is not user-adjustable.
|
||||
|
||||
## Limitations
|
||||
|
||||
- **Requires FAL API key** — image generation incurs API costs on your FAL.ai account
|
||||
- **No image editing** — this is text-to-image only, no inpainting or img2img
|
||||
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally
|
||||
- **Upscaling adds latency** — the automatic 2x upscale step adds processing time
|
||||
- **Max 4 images per request** — `num_images` is capped at 4
|
||||
411
hermes_code/website/docs/user-guide/features/mcp.md
Normal file
|
|
@ -0,0 +1,411 @@
|
|||
---
sidebar_position: 4
title: "MCP (Model Context Protocol)"
description: "Connect Hermes Agent to external tool servers via MCP, and control exactly which MCP tools Hermes loads"
---
|
||||
|
||||
# MCP (Model Context Protocol)
|
||||
|
||||
MCP lets Hermes Agent connect to external tool servers so the agent can use tools that live outside Hermes itself — GitHub, databases, file systems, browser stacks, internal APIs, and more.
|
||||
|
||||
If you have ever wanted Hermes to use a tool that already exists somewhere else, MCP is usually the cleanest way to do it.
|
||||
|
||||
## What MCP gives you
|
||||
|
||||
- Access to external tool ecosystems without writing a native Hermes tool first
|
||||
- Local stdio servers and remote HTTP MCP servers in the same config
|
||||
- Automatic tool discovery and registration at startup
|
||||
- Utility wrappers for MCP resources and prompts when supported by the server
|
||||
- Per-server filtering so you can expose only the MCP tools you actually want Hermes to see
|
||||
|
||||
## Quick start
|
||||
|
||||
1. Install MCP support (already included if you used the standard install script):
|
||||
|
||||
```bash
cd ~/.hermes/hermes-agent
uv pip install -e ".[mcp]"
```
|
||||
|
||||
2. Add an MCP server to `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
mcp_servers:
  filesystem:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
```
|
||||
|
||||
3. Start Hermes:
|
||||
|
||||
```bash
|
||||
hermes chat
|
||||
```
|
||||
|
||||
4. Ask Hermes to use the MCP-backed capability.
|
||||
|
||||
For example:
|
||||
|
||||
```text
|
||||
List the files in /home/user/projects and summarize the repo structure.
|
||||
```
|
||||
|
||||
Hermes will discover the MCP server's tools and use them like any other tool.
|
||||
|
||||
## Two kinds of MCP servers
|
||||
|
||||
### Stdio servers
|
||||
|
||||
Stdio servers run as local subprocesses and talk over stdin/stdout.
|
||||
|
||||
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "***"
```
|
||||
|
||||
Use stdio servers when:
|
||||
- the server is installed locally
|
||||
- you want low-latency access to local resources
|
||||
- you are following MCP server docs that show `command`, `args`, and `env`
|
||||
|
||||
### HTTP servers
|
||||
|
||||
HTTP MCP servers are remote endpoints Hermes connects to directly.
|
||||
|
||||
```yaml
mcp_servers:
  remote_api:
    url: "https://mcp.example.com/mcp"
    headers:
      Authorization: "Bearer ***"
```
|
||||
|
||||
Use HTTP servers when:
|
||||
- the MCP server is hosted elsewhere
|
||||
- your organization exposes internal MCP endpoints
|
||||
- you do not want Hermes spawning a local subprocess for that integration
|
||||
|
||||
## Basic configuration reference
|
||||
|
||||
Hermes reads MCP config from `~/.hermes/config.yaml` under `mcp_servers`.
|
||||
|
||||
### Common keys
|
||||
|
||||
| Key | Type | Meaning |
|---|---|---|
| `command` | string | Executable for a stdio MCP server |
| `args` | list | Arguments for the stdio server |
| `env` | mapping | Environment variables passed to the stdio server |
| `url` | string | HTTP MCP endpoint |
| `headers` | mapping | HTTP headers for remote servers |
| `timeout` | number | Tool call timeout |
| `connect_timeout` | number | Initial connection timeout |
| `enabled` | bool | If `false`, Hermes skips the server entirely |
| `tools` | mapping | Per-server tool filtering and utility policy |
|
||||
|
||||
### Minimal stdio example
|
||||
|
||||
```yaml
mcp_servers:
  filesystem:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
```
|
||||
|
||||
### Minimal HTTP example
|
||||
|
||||
```yaml
mcp_servers:
  company_api:
    url: "https://mcp.internal.example.com"
    headers:
      Authorization: "Bearer ***"
```
|
||||
|
||||
## How Hermes registers MCP tools
|
||||
|
||||
Hermes prefixes MCP tools so they do not collide with built-in names:
|
||||
|
||||
```text
|
||||
mcp_<server_name>_<tool_name>
|
||||
```
|
||||
|
||||
Examples:
|
||||
|
||||
| Server | MCP tool | Registered name |
|---|---|---|
| `filesystem` | `read_file` | `mcp_filesystem_read_file` |
| `github` | `create-issue` | `mcp_github_create_issue` |
| `my-api` | `query.data` | `mcp_my_api_query_data` |
|
||||
|
||||
In practice, you usually do not need to call the prefixed name manually — Hermes sees the tool and chooses it during normal reasoning.
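The naming pattern in the table can be sketched in a few lines of Python. This is an illustration only, not Hermes' actual implementation: `registered_tool_name` is a hypothetical helper, and the rule that every character outside `[A-Za-z0-9_]` becomes an underscore is an assumption inferred from the three examples above.

```python
import re


def registered_tool_name(server_name: str, tool_name: str) -> str:
    """Build a prefixed name of the form mcp_<server_name>_<tool_name>.

    Assumption: any character outside [A-Za-z0-9_] is replaced with an
    underscore, which reproduces the three examples in the table.
    """
    def sanitize(part: str) -> str:
        return re.sub(r"[^A-Za-z0-9_]", "_", part)

    return f"mcp_{sanitize(server_name)}_{sanitize(tool_name)}"


print(registered_tool_name("filesystem", "read_file"))
print(registered_tool_name("github", "create-issue"))
print(registered_tool_name("my-api", "query.data"))
```

The sketch is mainly useful for predicting which name to look for in the tool list when a server or tool name contains hyphens or dots.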
|
||||
|
||||
## MCP utility tools
|
||||
|
||||
When supported, Hermes also registers utility tools around MCP resources and prompts:
|
||||
|
||||
- `list_resources`
|
||||
- `read_resource`
|
||||
- `list_prompts`
|
||||
- `get_prompt`
|
||||
|
||||
These are registered per server with the same prefix pattern, for example:
|
||||
|
||||
- `mcp_github_list_resources`
|
||||
- `mcp_github_get_prompt`
|
||||
|
||||
### Important
|
||||
|
||||
These utility tools are now capability-aware:
|
||||
- Hermes only registers resource utilities if the MCP session actually supports resource operations
|
||||
- Hermes only registers prompt utilities if the MCP session actually supports prompt operations
|
||||
|
||||
So a server that exposes callable tools but no resources/prompts will not get those extra wrappers.
|
||||
|
||||
## Per-server filtering
|
||||
|
||||
This is the main feature added by the PR work.
|
||||
|
||||
You can now control which tools each MCP server contributes to Hermes.
|
||||
|
||||
### Disable a server entirely
|
||||
|
||||
```yaml
mcp_servers:
  legacy:
    url: "https://mcp.legacy.internal"
    enabled: false
```
|
||||
|
||||
When `enabled` is `false`, Hermes skips the server completely and does not attempt a connection.
|
||||
|
||||
### Whitelist server tools
|
||||
|
||||
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "***"
    tools:
      include: [create_issue, list_issues]
```
|
||||
|
||||
Only those MCP server tools are registered.
|
||||
|
||||
### Blacklist server tools
|
||||
|
||||
```yaml
mcp_servers:
  stripe:
    url: "https://mcp.stripe.com"
    tools:
      exclude: [delete_customer]
```
|
||||
|
||||
All server tools are registered except the excluded ones.
|
||||
|
||||
### Precedence rule
|
||||
|
||||
If both are present:
|
||||
|
||||
```yaml
tools:
  include: [create_issue]
  exclude: [create_issue, delete_issue]
```
|
||||
|
||||
`include` wins.
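One way to picture the precedence rule is the sketch below. It is not taken from the Hermes source: `filter_tools` is a hypothetical helper, and it assumes that a non-empty `include` list acts as a whitelist and causes `exclude` to be ignored.

```python
from typing import Iterable, List, Optional


def filter_tools(
    available: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Return the tool names that would be registered for one server.

    Assumption: when `include` is set it acts as a whitelist and `exclude`
    is ignored, which is one reading of "include wins".
    """
    if include:
        allowed = set(include)
        return [name for name in available if name in allowed]
    if exclude:
        blocked = set(exclude)
        return [name for name in available if name not in blocked]
    return list(available)


tools = ["create_issue", "delete_issue", "list_issues"]
print(filter_tools(tools, include=["create_issue"], exclude=["create_issue", "delete_issue"]))
print(filter_tools(tools, exclude=["delete_issue"]))
```

Under that assumption, the config above registers `create_issue` even though it also appears in `exclude`.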
|
||||
|
||||
### Filter utility tools too
|
||||
|
||||
You can also separately disable Hermes-added utility wrappers:
|
||||
|
||||
```yaml
mcp_servers:
  docs:
    url: "https://mcp.docs.example.com"
    tools:
      prompts: false
      resources: false
```
|
||||
|
||||
That means:
|
||||
- `tools.resources: false` disables `list_resources` and `read_resource`
|
||||
- `tools.prompts: false` disables `list_prompts` and `get_prompt`
|
||||
|
||||
### Full example
|
||||
|
||||
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "***"
    tools:
      include: [create_issue, list_issues, search_code]
      prompts: false

  stripe:
    url: "https://mcp.stripe.com"
    headers:
      Authorization: "Bearer ***"
    tools:
      exclude: [delete_customer]
      resources: false

  legacy:
    url: "https://mcp.legacy.internal"
    enabled: false
```
|
||||
|
||||
## What happens if everything is filtered out?
|
||||
|
||||
If your config filters out all callable tools and disables or omits all supported utilities, Hermes does not create an empty runtime MCP toolset for that server.
|
||||
|
||||
That keeps the tool list clean.
|
||||
|
||||
## Runtime behavior
|
||||
|
||||
### Discovery time
|
||||
|
||||
Hermes discovers MCP servers at startup and registers their tools into the normal tool registry.
|
||||
|
||||
### Reloading
|
||||
|
||||
If you change MCP config, use:
|
||||
|
||||
```text
|
||||
/reload-mcp
|
||||
```
|
||||
|
||||
This reloads MCP servers from config and refreshes the available tool list.
|
||||
|
||||
### Toolsets
|
||||
|
||||
Each configured MCP server also creates a runtime toolset when it contributes at least one registered tool:
|
||||
|
||||
```text
|
||||
mcp-<server>
|
||||
```
|
||||
|
||||
That makes MCP servers easier to reason about at the toolset level.
|
||||
|
||||
## Security model
|
||||
|
||||
### Stdio env filtering
|
||||
|
||||
For stdio servers, Hermes does not blindly pass your full shell environment.
|
||||
|
||||
Only explicitly configured `env` plus a safe baseline are passed through. This reduces accidental secret leakage.
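As a rough illustration of this idea, the sketch below builds a child environment from a small baseline plus the configured `env` mapping. The `SAFE_BASELINE` tuple and the `build_stdio_env` helper are hypothetical; the variables Hermes actually forwards may differ.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical baseline; the variables Hermes actually forwards may differ.
SAFE_BASELINE = ("PATH", "HOME", "LANG", "TERM")


def build_stdio_env(
    configured_env: Mapping[str, str],
    parent_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Combine a small baseline from the parent environment with explicit `env` entries."""
    source = os.environ if parent_env is None else parent_env
    child = {key: source[key] for key in SAFE_BASELINE if key in source}
    child.update(configured_env)  # explicit config overrides the baseline
    return child


parent = {"PATH": "/usr/bin", "HOME": "/home/user", "AWS_SECRET_ACCESS_KEY": "do-not-leak"}
env = build_stdio_env({"GITHUB_PERSONAL_ACCESS_TOKEN": "***"}, parent_env=parent)
print(sorted(env))
```

In this sketch the unrelated secret in the parent environment never reaches the subprocess because it is neither in the baseline nor in the configured `env`.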
|
||||
|
||||
### Config-level exposure control
|
||||
|
||||
The new filtering support is also a security control:
|
||||
- disable dangerous tools you do not want the model to see
|
||||
- expose only a minimal whitelist for a sensitive server
|
||||
- disable resource/prompt wrappers when you do not want that surface exposed
|
||||
|
||||
## Example use cases
|
||||
|
||||
### GitHub server with a minimal issue-management surface
|
||||
|
||||
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "***"
    tools:
      include: [list_issues, create_issue, update_issue]
      prompts: false
      resources: false
```
|
||||
|
||||
Use it like:
|
||||
|
||||
```text
|
||||
Show me open issues labeled bug, then draft a new issue for the flaky MCP reconnection behavior.
|
||||
```
|
||||
|
||||
### Stripe server with dangerous actions removed
|
||||
|
||||
```yaml
mcp_servers:
  stripe:
    url: "https://mcp.stripe.com"
    headers:
      Authorization: "Bearer ***"
    tools:
      exclude: [delete_customer, refund_payment]
```
|
||||
|
||||
Use it like:
|
||||
|
||||
```text
|
||||
Look up the last 10 failed payments and summarize common failure reasons.
|
||||
```
|
||||
|
||||
### Filesystem server for a single project root
|
||||
|
||||
```yaml
mcp_servers:
  project_fs:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/my-project"]
```
|
||||
|
||||
Use it like:
|
||||
|
||||
```text
|
||||
Inspect the project root and explain the directory layout.
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### MCP server not connecting
|
||||
|
||||
Check:
|
||||
|
||||
```bash
|
||||
# Verify MCP deps are installed (already included in standard install)
|
||||
cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"
|
||||
|
||||
node --version
|
||||
npx --version
|
||||
```
|
||||
|
||||
Then verify your config and restart Hermes.
|
||||
|
||||
### Tools not appearing
|
||||
|
||||
Possible causes:
|
||||
- the server failed to connect
|
||||
- discovery failed
|
||||
- your filter config excluded the tools
|
||||
- the utility capability does not exist on that server
|
||||
- the server is disabled with `enabled: false`
|
||||
|
||||
If you are intentionally filtering, this is expected.
|
||||
|
||||
### Why didn't resource or prompt utilities appear?
|
||||
|
||||
Because Hermes now only registers those wrappers when both are true:
|
||||
1. your config allows them
|
||||
2. the server session actually supports the capability
|
||||
|
||||
This is intentional and keeps the tool list honest.
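The two conditions can be expressed as a small decision function. This is a sketch under stated assumptions, not Hermes' code: `utility_tools` is a hypothetical helper, `capabilities` is a made-up summary of what the server session supports, and `resources` / `prompts` are assumed to default to enabled when omitted from the `tools` mapping.

```python
from typing import Dict, List

RESOURCE_UTILS = ["list_resources", "read_resource"]
PROMPT_UTILS = ["list_prompts", "get_prompt"]


def utility_tools(tools_config: Dict[str, object], capabilities: Dict[str, bool]) -> List[str]:
    """Return the utility wrappers to register for one server.

    Assumptions: `tools_config` is the server's `tools` mapping, with
    `resources` and `prompts` defaulting to True when omitted, and
    `capabilities` is a hypothetical summary of what the session supports.
    """
    names: List[str] = []
    if tools_config.get("resources", True) and capabilities.get("resources", False):
        names.extend(RESOURCE_UTILS)
    if tools_config.get("prompts", True) and capabilities.get("prompts", False):
        names.extend(PROMPT_UTILS)
    return names


print(utility_tools({"prompts": False}, {"resources": True, "prompts": True}))
print(utility_tools({}, {"resources": False, "prompts": True}))
```

Either condition alone is enough to hide a wrapper, which is why a missing utility can come from the config or from the server.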
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Use MCP with Hermes](/docs/guides/use-mcp-with-hermes)
|
||||
- [CLI Commands](/docs/reference/cli-commands)
|
||||
- [Slash Commands](/docs/reference/slash-commands)
|
||||
- [FAQ](/docs/reference/faq)
|
||||
218
hermes_code/website/docs/user-guide/features/memory.md
Normal file
|
|
@ -0,0 +1,218 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Persistent Memory"
|
||||
description: "How Hermes Agent remembers across sessions — MEMORY.md, USER.md, and session search"
|
||||
---
|
||||
|
||||
# Persistent Memory
|
||||
|
||||
Hermes Agent has bounded, curated memory that persists across sessions. This lets it remember your preferences, your projects, your environment, and things it has learned.
|
||||
|
||||
## How It Works
|
||||
|
||||
Two files make up the agent's memory:
|
||||
|
||||
| File | Purpose | Char Limit |
|------|---------|------------|
| **MEMORY.md** | Agent's personal notes: environment facts, conventions, things learned | 2,200 chars (~800 tokens) |
| **USER.md** | User profile: your preferences, communication style, expectations | 1,375 chars (~500 tokens) |
|
||||
|
||||
Both are stored in `~/.hermes/memories/` and are injected into the system prompt as a frozen snapshot at session start. The agent manages its own memory via the `memory` tool — it can add, replace, or remove entries.
|
||||
|
||||
:::info
|
||||
Character limits keep memory focused. When memory is full, the agent consolidates or replaces entries to make room for new information.
|
||||
:::
|
||||
|
||||
## How Memory Appears in the System Prompt
|
||||
|
||||
At the start of every session, memory entries are loaded from disk and rendered into the system prompt as a frozen block:
|
||||
|
||||
```
|
||||
══════════════════════════════════════════════
|
||||
MEMORY (your personal notes) [67% — 1,474/2,200 chars]
|
||||
══════════════════════════════════════════════
|
||||
User's project is a Rust web service at ~/code/myapi using Axum + SQLx
|
||||
§
|
||||
This machine runs Ubuntu 22.04, has Docker and Podman installed
|
||||
§
|
||||
User prefers concise responses, dislikes verbose explanations
|
||||
```
|
||||
|
||||
The format includes:
|
||||
- A header showing which store (MEMORY or USER PROFILE)
|
||||
- Usage percentage and character counts so the agent knows capacity
|
||||
- Individual entries separated by `§` (section sign) delimiters
|
||||
- Entries can be multiline
|
||||
|
||||
**Frozen snapshot pattern:** The system prompt injection is captured once at session start and never changes mid-session. This is intentional — it preserves the LLM's prefix cache for performance. When the agent adds/removes memory entries during a session, the changes are persisted to disk immediately but won't appear in the system prompt until the next session starts. Tool responses always show the live state.
|
||||
|
||||
## Memory Tool Actions
|
||||
|
||||
The agent uses the `memory` tool with these actions:
|
||||
|
||||
- **add** — Add a new memory entry
|
||||
- **replace** — Replace an existing entry with updated content (uses substring matching via `old_text`)
|
||||
- **remove** — Remove an entry that's no longer relevant (uses substring matching via `old_text`)
|
||||
|
||||
There is no `read` action — memory content is automatically injected into the system prompt at session start. The agent sees its memories as part of its conversation context.
|
||||
|
||||
### Substring Matching
|
||||
|
||||
The `replace` and `remove` actions use short unique substring matching — you don't need the full entry text. The `old_text` parameter just needs to be a unique substring that identifies exactly one entry:
|
||||
|
||||
```python
# If memory contains "User prefers dark mode in all editors"
memory(action="replace", target="memory",
       old_text="dark mode",
       content="User prefers light mode in VS Code, dark mode in terminal")
```
|
||||
|
||||
If the substring matches multiple entries, an error is returned asking for a more specific match.
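The matching rule can be sketched as follows. `find_unique_entry` is a hypothetical helper written for illustration, and the error wording is invented; only the behavior (exactly one entry must contain `old_text`) comes from the description above.

```python
from typing import List


def find_unique_entry(entries: List[str], old_text: str) -> int:
    """Return the index of the single entry containing `old_text`.

    Raises ValueError when no entry or more than one entry matches.
    """
    matches = [i for i, entry in enumerate(entries) if old_text in entry]
    if not matches:
        raise ValueError(f"No entry contains {old_text!r}")
    if len(matches) > 1:
        raise ValueError(
            f"{old_text!r} matches {len(matches)} entries; use a more specific substring"
        )
    return matches[0]


entries = [
    "User prefers dark mode in all editors",
    "This machine runs Ubuntu 22.04, has Docker and Podman installed",
]
index = find_unique_entry(entries, "dark mode")
entries[index] = "User prefers light mode in VS Code, dark mode in terminal"
print(entries[0])
```

A short but distinctive substring keeps the tool call small while still identifying one entry unambiguously.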
|
||||
|
||||
## Two Targets Explained
|
||||
|
||||
### `memory` — Agent's Personal Notes
|
||||
|
||||
For information the agent needs to remember about the environment, workflows, and lessons learned:
|
||||
|
||||
- Environment facts (OS, tools, project structure)
|
||||
- Project conventions and configuration
|
||||
- Tool quirks and workarounds discovered
|
||||
- Completed task diary entries
|
||||
- Skills and techniques that worked
|
||||
|
||||
### `user` — User Profile
|
||||
|
||||
For information about the user's identity, preferences, and communication style:
|
||||
|
||||
- Name, role, timezone
|
||||
- Communication preferences (concise vs detailed, format preferences)
|
||||
- Pet peeves and things to avoid
|
||||
- Workflow habits
|
||||
- Technical skill level
|
||||
|
||||
## What to Save vs Skip
|
||||
|
||||
### Save These (Proactively)
|
||||
|
||||
The agent saves automatically — you don't need to ask. It saves when it learns:
|
||||
|
||||
- **User preferences:** "I prefer TypeScript over JavaScript" → save to `user`
|
||||
- **Environment facts:** "This server runs Debian 12 with PostgreSQL 16" → save to `memory`
|
||||
- **Corrections:** "Don't use `sudo` for Docker commands, user is in docker group" → save to `memory`
|
||||
- **Conventions:** "Project uses tabs, 120-char line width, Google-style docstrings" → save to `memory`
|
||||
- **Completed work:** "Migrated database from MySQL to PostgreSQL on 2026-01-15" → save to `memory`
|
||||
- **Explicit requests:** "Remember that my API key rotation happens monthly" → save to `memory`
|
||||
|
||||
### Skip These
|
||||
|
||||
- **Trivial/obvious info:** "User asked about Python" — too vague to be useful
|
||||
- **Easily re-discovered facts:** "Python 3.12 supports f-string nesting" (a web search can recover this)
|
||||
- **Raw data dumps:** Large code blocks, log files, data tables — too big for memory
|
||||
- **Session-specific ephemera:** Temporary file paths, one-off debugging context
|
||||
- **Information already in context files:** SOUL.md and AGENTS.md content
|
||||
|
||||
## Capacity Management
|
||||
|
||||
Memory has strict character limits to keep system prompts bounded:
|
||||
|
||||
| Store | Limit | Typical entries |
|-------|-------|----------------|
| memory | 2,200 chars | 8-15 entries |
| user | 1,375 chars | 5-10 entries |
|
||||
|
||||
### What Happens When Memory is Full
|
||||
|
||||
When you try to add an entry that would exceed the limit, the tool returns an error:
|
||||
|
||||
```json
{
  "success": false,
  "error": "Memory at 2,100/2,200 chars. Adding this entry (250 chars) would exceed the limit. Replace or remove existing entries first.",
  "current_entries": ["..."],
  "usage": "2,100/2,200"
}
```
|
||||
|
||||
The agent should then:
|
||||
1. Read the current entries (shown in the error response)
|
||||
2. Identify entries that can be removed or consolidated
|
||||
3. Use `replace` to merge related entries into shorter versions
|
||||
4. Then `add` the new entry
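The capacity check behind that error can be approximated as below. This is a minimal sketch, assuming usage is the sum of entry lengths; the real accounting (for example, whether delimiters count) may differ. `try_add` and the shape of the success response are hypothetical, while the error fields mirror the JSON shown above.

```python
from typing import Dict, List


def try_add(entries: List[str], content: str, char_limit: int = 2200) -> Dict[str, object]:
    """Append `content` unless it would push total usage past `char_limit`.

    Assumption: usage is the sum of entry lengths; the real accounting
    (for example, whether delimiters count) may differ.
    """
    used = sum(len(entry) for entry in entries)
    if used + len(content) > char_limit:
        return {
            "success": False,
            "error": (
                f"Memory at {used:,}/{char_limit:,} chars. Adding this entry "
                f"({len(content)} chars) would exceed the limit. "
                "Replace or remove existing entries first."
            ),
            "current_entries": list(entries),
            "usage": f"{used:,}/{char_limit:,}",
        }
    entries.append(content)
    return {"success": True, "usage": f"{used + len(content):,}/{char_limit:,}"}


entries = ["x" * 2100]
result = try_add(entries, "y" * 250)
print(result["success"], result["usage"])
```

Returning the current entries with the error lets the caller consolidate without a separate read step, which matches the workflow in the list above.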
|
||||
|
||||
**Best practice:** When memory is above 80% capacity (visible in the system prompt header), consolidate entries before adding new ones. For example, merge three separate "project uses X" entries into one comprehensive project description entry.
|
||||
|
||||
### Practical Examples of Good Memory Entries
|
||||
|
||||
**Compact, information-dense entries work best:**
|
||||
|
||||
```
|
||||
# Good: Packs multiple related facts
|
||||
User runs macOS 14 Sonoma, uses Homebrew, has Docker Desktop and Podman. Shell: zsh with oh-my-zsh. Editor: VS Code with Vim keybindings.
|
||||
|
||||
# Good: Specific, actionable convention
|
||||
Project ~/code/api uses Go 1.22, sqlc for DB queries, chi router. Run tests with 'make test'. CI via GitHub Actions.
|
||||
|
||||
# Good: Lesson learned with context
|
||||
The staging server (10.0.1.50) needs SSH port 2222, not 22. Key is at ~/.ssh/staging_ed25519.
|
||||
|
||||
# Bad: Too vague
|
||||
User has a project.
|
||||
|
||||
# Bad: Too verbose
|
||||
On January 5th, 2026, the user asked me to look at their project which is
|
||||
located at ~/code/api. I discovered it uses Go version 1.22 and...
|
||||
```
|
||||
|
||||
## Duplicate Prevention
|
||||
|
||||
The memory system automatically rejects exact duplicate entries. If you try to add content that already exists, it returns success with a "no duplicate added" message.
|
||||
|
||||
## Security Scanning
|
||||
|
||||
Memory entries are scanned for injection and exfiltration patterns before being accepted, since they're injected into the system prompt. Content matching threat patterns (prompt injection, credential exfiltration, SSH backdoors) or containing invisible Unicode characters is blocked.
|
||||
|
||||
## Session Search
|
||||
|
||||
Beyond MEMORY.md and USER.md, the agent can search its past conversations using the `session_search` tool:
|
||||
|
||||
- All CLI and messaging sessions are stored in SQLite (`~/.hermes/state.db`) with FTS5 full-text search
|
||||
- Search queries return relevant past conversations, summarized by Gemini Flash
|
||||
- The agent can find things it discussed weeks ago, even if they're not in its active memory
|
||||
|
||||
```bash
|
||||
hermes sessions list # Browse past sessions
|
||||
```
|
||||
|
||||
### session_search vs memory
|
||||
|
||||
| Feature | Persistent Memory | Session Search |
|---------|------------------|----------------|
| **Capacity** | ~1,300 tokens total | Unlimited (all sessions) |
| **Speed** | Instant (in system prompt) | Requires search + LLM summarization |
| **Use case** | Key facts always available | Finding specific past conversations |
| **Management** | Manually curated by agent | Automatic (all sessions stored) |
| **Token cost** | Fixed per session (~1,300 tokens) | On-demand (searched when needed) |
|
||||
|
||||
**Memory** is for critical facts that should always be in context. **Session search** is for "did we discuss X last week?" queries where the agent needs to recall specifics from past conversations.
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
# In ~/.hermes/config.yaml
memory:
  memory_enabled: true
  user_profile_enabled: true
  memory_char_limit: 2200   # ~800 tokens
  user_char_limit: 1375     # ~500 tokens
```
|
||||
|
||||
## Honcho Integration (Cross-Session User Modeling)
|
||||
|
||||
For deeper, AI-generated user understanding that works across sessions and platforms, you can enable [Honcho Memory](./honcho.md). Honcho runs alongside built-in memory in `hybrid` mode (the default) — `MEMORY.md` and `USER.md` stay as-is, and Honcho adds a persistent user modeling layer on top.
|
||||
|
||||
```bash
|
||||
hermes honcho setup
|
||||
```
|
||||
|
||||
See the [Honcho Memory](./honcho.md) docs for full configuration, tools, and CLI reference.
|
||||
271
hermes_code/website/docs/user-guide/features/personality.md
Normal file
|
|
@ -0,0 +1,271 @@
|
|||
---
|
||||
sidebar_position: 9
|
||||
title: "Personality & SOUL.md"
|
||||
description: "Customize Hermes Agent's personality with a global SOUL.md, built-in personalities, and custom persona definitions"
|
||||
---
|
||||
|
||||
# Personality & SOUL.md
|
||||
|
||||
Hermes Agent's personality is fully customizable. `SOUL.md` is the **primary identity** — it's the first thing in the system prompt and defines who the agent is.
|
||||
|
||||
- `SOUL.md` — a durable persona file that lives in `HERMES_HOME` and serves as the agent's identity (slot #1 in the system prompt)
|
||||
- built-in or custom `/personality` presets — session-level system-prompt overlays
|
||||
|
||||
If you want to change who Hermes is — or replace it with an entirely different agent persona — edit `SOUL.md`.
|
||||
|
||||
## How SOUL.md works now
|
||||
|
||||
Hermes now seeds a default `SOUL.md` automatically in:
|
||||
|
||||
```text
|
||||
~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
More precisely, it uses the current instance's `HERMES_HOME`, so if you run Hermes with a custom home directory, it will use:
|
||||
|
||||
```text
|
||||
$HERMES_HOME/SOUL.md
|
||||
```
|
||||
|
||||
### Important behavior
|
||||
|
||||
- **SOUL.md is the agent's primary identity.** It occupies slot #1 in the system prompt, replacing the hardcoded default identity.
|
||||
- Hermes creates a starter `SOUL.md` automatically if one does not exist yet
|
||||
- Existing user `SOUL.md` files are never overwritten
|
||||
- Hermes loads `SOUL.md` only from `HERMES_HOME`
|
||||
- Hermes does not look in the current working directory for `SOUL.md`
|
||||
- If `SOUL.md` exists but is empty, or cannot be loaded, Hermes falls back to a built-in default identity
|
||||
- If `SOUL.md` has content, that content is injected verbatim after security scanning and truncation
|
||||
- SOUL.md is **not** duplicated in the context files section — it appears only once, as the identity
|
||||
|
||||
That makes `SOUL.md` a true per-user or per-instance identity, not just an additive layer.
|
||||
|
||||
## Why this design
|
||||
|
||||
This keeps personality predictable.
|
||||
|
||||
If Hermes loaded `SOUL.md` from whatever directory you happened to launch it in, your personality could change unexpectedly between projects. By loading only from `HERMES_HOME`, the personality belongs to the Hermes instance itself.
|
||||
|
||||
That also makes it easier to teach users:
|
||||
- "Edit `~/.hermes/SOUL.md` to change Hermes' default personality."
|
||||
|
||||
## Where to edit it
|
||||
|
||||
For most users:
|
||||
|
||||
```bash
|
||||
~/.hermes/SOUL.md
|
||||
```
|
||||
|
||||
If you use a custom home:
|
||||
|
||||
```bash
|
||||
$HERMES_HOME/SOUL.md
|
||||
```
|
||||
|
||||
## What should go in SOUL.md?
|
||||
|
||||
Use it for durable voice and personality guidance, such as:
|
||||
- tone
|
||||
- communication style
|
||||
- level of directness
|
||||
- default interaction style
|
||||
- what to avoid stylistically
|
||||
- how Hermes should handle uncertainty, disagreement, or ambiguity
|
||||
|
||||
Use it less for:
|
||||
- one-off project instructions
|
||||
- file paths
|
||||
- repo conventions
|
||||
- temporary workflow details
|
||||
|
||||
Those belong in `AGENTS.md`, not `SOUL.md`.
|
||||
|
||||
## Good SOUL.md content
|
||||
|
||||
A good SOUL file is:
|
||||
- stable across contexts
|
||||
- broad enough to apply in many conversations
|
||||
- specific enough to materially shape the voice
|
||||
- focused on communication and identity, not task-specific instructions
|
||||
|
||||
### Example
|
||||
|
||||
```markdown
|
||||
# Personality
|
||||
|
||||
You are a pragmatic senior engineer with strong taste.
|
||||
You optimize for truth, clarity, and usefulness over politeness theater.
|
||||
|
||||
## Style
|
||||
- Be direct without being cold
|
||||
- Prefer substance over filler
|
||||
- Push back when something is a bad idea
|
||||
- Admit uncertainty plainly
|
||||
- Keep explanations compact unless depth is useful
|
||||
|
||||
## What to avoid
|
||||
- Sycophancy
|
||||
- Hype language
|
||||
- Repeating the user's framing if it's wrong
|
||||
- Overexplaining obvious things
|
||||
|
||||
## Technical posture
|
||||
- Prefer simple systems over clever systems
|
||||
- Care about operational reality, not idealized architecture
|
||||
- Treat edge cases as part of the design, not cleanup
|
||||
```
|
||||
|
||||
## What Hermes injects into the prompt
|
||||
|
||||
`SOUL.md` content goes directly into slot #1 of the system prompt — the agent identity position. No wrapper language is added around it.
|
||||
|
||||
The content goes through:
|
||||
- prompt-injection scanning
|
||||
- truncation if it is too large
|
||||
|
||||
If the file is empty, whitespace-only, or cannot be read, Hermes falls back to a built-in default identity ("You are Hermes Agent, an intelligent AI assistant created by Nous Research..."). This fallback also applies when `skip_context_files` is set (e.g., in subagent/delegation contexts).
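The fallback logic described here can be sketched roughly as follows. Everything named in the block is an assumption made for illustration: `load_identity` is a hypothetical helper, `FALLBACK_IDENTITY` holds only the truncated text quoted above, `MAX_SOUL_CHARS` is an invented limit, and the prompt-injection scan is left out.

```python
from pathlib import Path

# Placeholder text; the real built-in identity string is longer.
FALLBACK_IDENTITY = "You are Hermes Agent, an intelligent AI assistant created by Nous Research..."
MAX_SOUL_CHARS = 20_000  # hypothetical truncation limit


def load_identity(hermes_home: Path, skip_context_files: bool = False) -> str:
    """Return the slot #1 identity text: SOUL.md content or the fallback."""
    if skip_context_files:
        return FALLBACK_IDENTITY
    try:
        text = (hermes_home / "SOUL.md").read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return FALLBACK_IDENTITY
    if not text.strip():
        return FALLBACK_IDENTITY
    # Prompt-injection scanning would run here; it is omitted from this sketch.
    return text[:MAX_SOUL_CHARS]


# A missing home directory falls back to the built-in identity.
print(load_identity(Path("/nonexistent-hermes-home")) == FALLBACK_IDENTITY)
```

The sketch reads only from the given home directory and never from the current working directory, in line with the loading rules earlier on this page.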
|
||||
|
||||
## Security scanning
|
||||
|
||||
`SOUL.md` is scanned like other context-bearing files for prompt injection patterns before inclusion.
|
||||
|
||||
That means you should still keep it focused on persona/voice rather than trying to sneak in strange meta-instructions.
|
||||
|
||||
## SOUL.md vs AGENTS.md
|
||||
|
||||
This is the most important distinction.
|
||||
|
||||
### SOUL.md
|
||||
Use for:
|
||||
- identity
|
||||
- tone
|
||||
- style
|
||||
- communication defaults
|
||||
- personality-level behavior
|
||||
|
||||
### AGENTS.md
|
||||
Use for:
|
||||
- project architecture
|
||||
- coding conventions
|
||||
- tool preferences
|
||||
- repo-specific workflows
|
||||
- commands, ports, paths, deployment notes
|
||||
|
||||
A useful rule:
|
||||
- if it should follow you everywhere, it belongs in `SOUL.md`
|
||||
- if it belongs to a project, it belongs in `AGENTS.md`
|
||||
|
||||
## SOUL.md vs `/personality`
|
||||
|
||||
`SOUL.md` is your durable default personality.
|
||||
|
||||
`/personality` is a session-level overlay that changes or supplements the current system prompt.
|
||||
|
||||
So:
|
||||
- `SOUL.md` = baseline voice
|
||||
- `/personality` = temporary mode switch
|
||||
|
||||
Examples:
|
||||
- keep a pragmatic default SOUL, then use `/personality teacher` for a tutoring conversation
|
||||
- keep a concise SOUL, then use `/personality creative` for brainstorming
|
||||
|
||||
## Built-in personalities
|
||||
|
||||
Hermes ships with built-in personalities you can switch to with `/personality`.
|
||||
|
||||
| Name | Description |
|------|-------------|
| **helpful** | Friendly, general-purpose assistant |
| **concise** | Brief, to-the-point responses |
| **technical** | Detailed, accurate technical expert |
| **creative** | Innovative, outside-the-box thinking |
| **teacher** | Patient educator with clear examples |
| **kawaii** | Cute expressions, sparkles, and enthusiasm ★ |
| **catgirl** | Neko-chan with cat-like expressions, nya~ |
| **pirate** | Captain Hermes, tech-savvy buccaneer |
| **shakespeare** | Bardic prose with dramatic flair |
| **surfer** | Totally chill bro vibes |
| **noir** | Hard-boiled detective narration |
| **uwu** | Maximum cute with uwu-speak |
| **philosopher** | Deep contemplation on every query |
| **hype** | MAXIMUM ENERGY AND ENTHUSIASM!!! |
|
||||
|
||||
## Switching personalities with commands
|
||||
|
||||
### CLI
|
||||
|
||||
```text
|
||||
/personality
|
||||
/personality concise
|
||||
/personality technical
|
||||
```
|
||||
|
||||
### Messaging platforms
|
||||
|
||||
```text
|
||||
/personality teacher
|
||||
```
|
||||
|
||||
These are convenient overlays, but your global `SOUL.md` still gives Hermes its persistent default personality unless the overlay meaningfully changes it.
|
||||
|
||||
## Custom personalities in config
|
||||
|
||||
You can also define named custom personalities in `~/.hermes/config.yaml` under `agent.personalities`.
|
||||
|
||||
```yaml
agent:
  personalities:
    codereviewer: >
      You are a meticulous code reviewer. Identify bugs, security issues,
      performance concerns, and unclear design choices. Be precise and constructive.
```
|
||||
|
||||
Then switch to it with:
|
||||
|
||||
```text
|
||||
/personality codereviewer
|
||||
```
|
||||
|
||||
## Recommended workflow
|
||||
|
||||
A strong default setup is:
|
||||
|
||||
1. Keep a thoughtful global `SOUL.md` in `~/.hermes/SOUL.md`
|
||||
2. Put project instructions in `AGENTS.md`
|
||||
3. Use `/personality` only when you want a temporary mode shift
|
||||
|
||||
That gives you:
|
||||
- a stable voice
|
||||
- project-specific behavior where it belongs
|
||||
- temporary control when needed
|
||||
|
||||
## How personality interacts with the full prompt
|
||||
|
||||
At a high level, the prompt stack includes:
|
||||
1. **SOUL.md** (agent identity — or built-in fallback if SOUL.md is unavailable)
|
||||
2. tool-aware behavior guidance
|
||||
3. memory/user context
|
||||
4. skills guidance
|
||||
5. context files (`AGENTS.md`, `.cursorrules`)
|
||||
6. timestamp
|
||||
7. platform-specific formatting hints
|
||||
8. optional system-prompt overlays such as `/personality`
|
||||
|
||||
`SOUL.md` is the foundation — everything else builds on top of it.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Context Files](/docs/user-guide/features/context-files)
|
||||
- [Configuration](/docs/user-guide/configuration)
|
||||
- [Tips & Best Practices](/docs/guides/tips)
|
||||
- [SOUL.md Guide](/docs/guides/use-soul-with-hermes)
|
||||
|
||||
## CLI appearance vs conversational personality
|
||||
|
||||
Conversational personality and CLI appearance are separate:
|
||||
|
||||
- `SOUL.md`, `agent.system_prompt`, and `/personality` affect how Hermes speaks
|
||||
- `display.skin` and `/skin` affect how Hermes looks in the terminal
|
||||
|
||||
For terminal appearance, see [Skins & Themes](./skins.md).
|
||||
92
hermes_code/website/docs/user-guide/features/plugins.md
Normal file
|
|
@ -0,0 +1,92 @@
|
|||
---
|
||||
sidebar_position: 20
|
||||
---
|
||||
|
||||
# Plugins
|
||||
|
||||
Hermes has a plugin system for adding custom tools, hooks, slash commands, and integrations without modifying core code.
|
||||
|
||||
**→ [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin)** — step-by-step guide with a complete working example.
|
||||
|
||||
## Quick overview
|
||||
|
||||
Drop a directory into `~/.hermes/plugins/` with a `plugin.yaml` and Python code:
|
||||
|
||||
```
|
||||
~/.hermes/plugins/my-plugin/
|
||||
├── plugin.yaml # manifest
|
||||
├── __init__.py # register() — wires schemas to handlers
|
||||
├── schemas.py # tool schemas (what the LLM sees)
|
||||
└── tools.py # tool handlers (what runs when called)
|
||||
```
|
||||
|
||||
Start Hermes — your tools appear alongside built-in tools. The model can call them immediately.
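
To make the layout concrete, here is a minimal single-file sketch of what `register()` might look like. Only the call shape `ctx.register_tool(name, schema, handler)` comes from this page. The schema layout, the handler signature, and the `FakeContext` stand-in (used so the sketch runs outside Hermes) are assumptions. The full guide documents the real contracts.

```python
# Sketch of a plugin's register() wiring one tool. The schema layout and the
# handler signature are assumptions; see the full guide for the real contracts.

WORD_COUNT_SCHEMA = {  # hypothetical schema shape
    "name": "word_count",
    "description": "Count the words in a piece of text.",
    "parameters": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def word_count(args):
    """Hypothetical handler: takes a dict of arguments, returns a string result."""
    return str(len(args.get("text", "").split()))


def register(ctx):
    ctx.register_tool("word_count", WORD_COUNT_SCHEMA, word_count)


class FakeContext:
    """Stand-in for the object Hermes passes to register(); for local testing only."""

    def __init__(self):
        self.tools = {}

    def register_tool(self, name, schema, handler):
        self.tools[name] = (schema, handler)


ctx = FakeContext()
register(ctx)
schema, handler = ctx.tools["word_count"]
print(handler({"text": "hello plugin world"}))
```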
|
||||
|
||||
Project-local plugins under `./.hermes/plugins/` are disabled by default. Enable them only for trusted repositories by setting `HERMES_ENABLE_PROJECT_PLUGINS=true` before starting Hermes.
|
||||
|
||||
## What plugins can do
|
||||
|
||||
| Capability | How |
|
||||
|-----------|-----|
|
||||
| Add tools | `ctx.register_tool(name, schema, handler)` |
|
||||
| Add hooks | `ctx.register_hook("post_tool_call", callback)` |
|
||||
| Add slash commands | `ctx.register_command("mycommand", handler)` |
|
||||
| Ship data files | `Path(__file__).parent / "data" / "file.yaml"` |
|
||||
| Bundle skills | Copy `skill.md` to `~/.hermes/skills/` at load time |
|
||||
| Gate on env vars | `requires_env: [API_KEY]` in plugin.yaml |
|
||||
| Distribute via pip | `[project.entry-points."hermes_agent.plugins"]` |
|
||||
|
||||
## Plugin discovery
|
||||
|
||||
| Source | Path | Use case |
|
||||
|--------|------|----------|
|
||||
| User | `~/.hermes/plugins/` | Personal plugins |
|
||||
| Project | `.hermes/plugins/` | Project-specific plugins (requires `HERMES_ENABLE_PROJECT_PLUGINS=true`) |
|
||||
| pip | `hermes_agent.plugins` entry_points | Distributed packages |
|
||||
|
||||
## Available hooks
|
||||
|
||||
| Hook | Fires when |
|
||||
|------|-----------|
|
||||
| `pre_tool_call` | Before any tool executes |
|
||||
| `post_tool_call` | After any tool returns |
|
||||
| `pre_llm_call` | Before LLM API request |
|
||||
| `post_llm_call` | After LLM API response |
|
||||
| `on_session_start` | Session begins |
|
||||
| `on_session_end` | Session ends |
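
A hook registration could be sketched as follows. The `ctx.register_hook("post_tool_call", callback)` call is taken from the capability table. The keyword arguments a callback receives are not documented on this page, so the callback accepts `**kwargs`, and the `tool_name` key and `FakeContext` class are hypothetical.

```python
# Sketch of a hook that counts tool calls. The keyword arguments a hook
# receives are an assumption here, so the callback accepts **kwargs.
from collections import Counter

tool_calls = Counter()


def count_tool_call(**kwargs):
    """Hypothetical callback: tallies calls by a 'tool_name' keyword if present."""
    tool_calls[kwargs.get("tool_name", "unknown")] += 1


def register(ctx):
    ctx.register_hook("post_tool_call", count_tool_call)


class FakeContext:
    """Stand-in for the plugin context; for local testing only."""

    def __init__(self):
        self.hooks = {}

    def register_hook(self, event, callback):
        self.hooks.setdefault(event, []).append(callback)

    def fire(self, event, **kwargs):
        for callback in self.hooks.get(event, []):
            callback(**kwargs)


ctx = FakeContext()
register(ctx)
ctx.fire("post_tool_call", tool_name="terminal")
ctx.fire("post_tool_call", tool_name="terminal")
print(dict(tool_calls))
```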
|
||||
|
||||
## Slash commands
|
||||
|
||||
Plugins can register slash commands that work in both CLI and messaging platforms:
|
||||
|
||||
```python
def register(ctx):
    ctx.register_command(
        name="greet",
        handler=lambda args: f"Hello, {args or 'world'}!",
        description="Greet someone",
        args_hint="[name]",
        aliases=("hi",),
    )
```
|
||||
|
||||
The handler receives the argument string (everything after `/greet`) and returns a string to display. Registered commands automatically appear in `/help`, tab autocomplete, Telegram bot menu, and Slack subcommand mapping.
|
||||
|
||||
| Parameter | Description |
|
||||
|-----------|-------------|
|
||||
| `name` | Command name without slash |
|
||||
| `handler` | Callable that takes `args: str` and returns `str \| None` |
|
||||
| `description` | Shown in `/help` |
|
||||
| `args_hint` | Usage hint, e.g. `"[name]"` |
|
||||
| `aliases` | Tuple of alternative names |
|
||||
| `cli_only` | Only available in CLI |
|
||||
| `gateway_only` | Only available in messaging platforms |
|
||||
|
||||
## Managing plugins
|
||||
|
||||
```
|
||||
/plugins # list loaded plugins in a session
|
||||
hermes config set display.show_cost true # show cost in status bar
|
||||
```
|
||||
|
||||
See the **[full guide](/docs/guides/build-a-hermes-plugin)** for handler contracts, schema format, hook behavior, error handling, and common mistakes.
|
||||
200
hermes_code/website/docs/user-guide/features/provider-routing.md
Normal file
|
|
@ -0,0 +1,200 @@
|
|||
---
|
||||
title: Provider Routing
|
||||
description: Configure OpenRouter provider preferences to optimize for cost, speed, or quality.
|
||||
sidebar_label: Provider Routing
|
||||
sidebar_position: 7
|
||||
---
|
||||
|
||||
# Provider Routing
|
||||
|
||||
When using [OpenRouter](https://openrouter.ai) as your LLM provider, Hermes Agent supports **provider routing** — fine-grained control over which underlying AI providers handle your requests and how they're prioritized.
|
||||
|
||||
OpenRouter routes requests to many providers (e.g., Anthropic, Google, AWS Bedrock, Together AI). Provider routing lets you optimize for cost, speed, or quality, or restrict requests to providers that meet specific requirements.
|
||||
|
||||
## Configuration
|
||||
|
||||
Add a `provider_routing` section to your `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
provider_routing:
  sort: "price"                # How to rank providers
  only: []                     # Whitelist: only use these providers
  ignore: []                   # Blacklist: never use these providers
  order: []                    # Explicit provider priority order
  require_parameters: false    # Only use providers that support all parameters
  data_collection: null        # Control data collection ("allow" or "deny")
```
|
||||
|
||||
:::info
|
||||
Provider routing only applies when using OpenRouter. It has no effect with direct provider connections (e.g., connecting directly to the Anthropic API).
|
||||
:::
|
||||
|
||||
## Options
|
||||
|
||||
### `sort`
|
||||
|
||||
Controls how OpenRouter ranks available providers for your request.
|
||||
|
||||
| Value | Description |
|
||||
|-------|-------------|
|
||||
| `"price"` | Cheapest provider first |
|
||||
| `"throughput"` | Fastest tokens-per-second first |
|
||||
| `"latency"` | Lowest time-to-first-token first |
|
||||
|
||||
```yaml
provider_routing:
  sort: "price"
```
|
||||
|
||||
### `only`
|
||||
|
||||
Whitelist of provider names. When set, **only** these providers will be used. All others are excluded.
|
||||
|
||||
```yaml
provider_routing:
  only:
    - "Anthropic"
    - "Google"
```
|
||||
|
||||
### `ignore`
|
||||
|
||||
Blacklist of provider names. These providers will **never** be used, even if they offer the cheapest or fastest option.
|
||||
|
||||
```yaml
provider_routing:
  ignore:
    - "Together"
    - "DeepInfra"
```
|
||||
|
||||
### `order`
|
||||
|
||||
Explicit priority order. Providers listed first are preferred. Unlisted providers are used as fallbacks.
|
||||
|
||||
```yaml
provider_routing:
  order:
    - "Anthropic"
    - "Google"
    - "AWS Bedrock"
```
|
||||
|
||||
### `require_parameters`
|
||||
|
||||
When `true`, OpenRouter will only route to providers that support **all** parameters in your request (like `temperature`, `top_p`, `tools`, etc.). This avoids silent parameter drops.
|
||||
|
||||
```yaml
provider_routing:
  require_parameters: true
```
|
||||
|
||||
### `data_collection`
|
||||
|
||||
Controls whether providers can use your prompts for training. Options are `"allow"` or `"deny"`.
|
||||
|
||||
```yaml
provider_routing:
  data_collection: "deny"
```
|
||||
|
||||
## Practical Examples
|
||||
|
||||
### Optimize for Cost
|
||||
|
||||
Route to the cheapest available provider. Good for high-volume usage and development:
|
||||
|
||||
```yaml
provider_routing:
  sort: "price"
```
|
||||
|
||||
### Optimize for Speed
|
||||
|
||||
Prioritize low-latency providers for interactive use:
|
||||
|
||||
```yaml
provider_routing:
  sort: "latency"
```
|
||||
|
||||
### Optimize for Throughput
|
||||
|
||||
Best for long-form generation where tokens-per-second matters:
|
||||
|
||||
```yaml
provider_routing:
  sort: "throughput"
```
|
||||
|
||||
### Lock to Specific Providers
|
||||
|
||||
Ensure all requests go through a specific provider for consistency:
|
||||
|
||||
```yaml
provider_routing:
  only:
    - "Anthropic"
```
|
||||
|
||||
### Avoid Specific Providers
|
||||
|
||||
Exclude providers you don't want to use (e.g., for data privacy):
|
||||
|
||||
```yaml
provider_routing:
  ignore:
    - "Together"
    - "Lepton"
  data_collection: "deny"
```
|
||||
|
||||
### Preferred Order with Fallbacks
|
||||
|
||||
Try your preferred providers first, fall back to others if unavailable:
|
||||
|
||||
```yaml
provider_routing:
  order:
    - "Anthropic"
    - "Google"
  require_parameters: true
```
|
||||
|
||||
## How It Works
|
||||
|
||||
Provider routing preferences are passed to the OpenRouter API via the `extra_body.provider` field on every API call. This applies to both:
|
||||
|
||||
- **CLI mode** — configured in `~/.hermes/config.yaml`, loaded at startup
|
||||
- **Gateway mode** — same config file, loaded when the gateway starts
|
||||
|
||||
The routing config is read from `config.yaml` and passed as parameters when creating the `AIAgent`:
|
||||
|
||||
```
|
||||
providers_allowed ← from provider_routing.only
|
||||
providers_ignored ← from provider_routing.ignore
|
||||
providers_order ← from provider_routing.order
|
||||
provider_sort ← from provider_routing.sort
|
||||
provider_require_parameters ← from provider_routing.require_parameters
|
||||
provider_data_collection ← from provider_routing.data_collection
|
||||
```
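
The mapping above can be pictured with a short sketch that turns a parsed `provider_routing` section into the object sent under `extra_body.provider`. The function name `build_provider_preferences` is hypothetical, and dropping empty or unset options so that OpenRouter's defaults apply is an assumption about how Hermes behaves, not a confirmed detail.

```python
# Sketch: build the OpenRouter `provider` preferences object from a parsed
# provider_routing section. The function name is hypothetical; unset or empty
# options are omitted so OpenRouter's defaults apply (an assumption).

def build_provider_preferences(routing):
    prefs = {}
    if routing.get("sort"):
        prefs["sort"] = routing["sort"]
    for key in ("only", "ignore", "order"):
        if routing.get(key):
            prefs[key] = list(routing[key])
    if routing.get("require_parameters"):
        prefs["require_parameters"] = True
    if routing.get("data_collection") in ("allow", "deny"):
        prefs["data_collection"] = routing["data_collection"]
    return prefs


routing = {
    "sort": "price",
    "only": [],
    "ignore": ["Together"],
    "order": [],
    "require_parameters": True,
    "data_collection": "deny",
}
extra_body = {"provider": build_provider_preferences(routing)}
print(extra_body)
```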
|
||||
|
||||
:::tip
|
||||
You can combine multiple options. For example, sort by price but exclude certain providers and require parameter support:
|
||||
|
||||
```yaml
provider_routing:
  sort: "price"
  ignore: ["Together"]
  require_parameters: true
  data_collection: "deny"
```
|
||||
:::
|
||||
|
||||
## Default Behavior
|
||||
|
||||
When no `provider_routing` section is configured (the default), OpenRouter uses its own default routing logic, which generally balances cost and availability automatically.
|
||||
|
||||
:::tip Provider Routing vs. Fallback Models
|
||||
Provider routing controls which **sub-providers within OpenRouter** handle your requests. For automatic failover to an entirely different provider when your primary model fails, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
|
||||
:::
|
||||
234
hermes_code/website/docs/user-guide/features/rl-training.md
Normal file
|
|
@ -0,0 +1,234 @@
|
|||
---
|
||||
sidebar_position: 13
|
||||
title: "RL Training"
|
||||
description: "Reinforcement learning on agent behaviors with Tinker-Atropos — environment discovery, training, and evaluation"
|
||||
---
|
||||
|
||||
# RL Training
|
||||
|
||||
Hermes Agent includes an integrated RL (Reinforcement Learning) training pipeline built on **Tinker-Atropos**. This enables training language models on environment-specific tasks using GRPO (Group Relative Policy Optimization) with LoRA adapters, orchestrated entirely through the agent's tool interface.
|
||||
|
||||
## Overview
|
||||
|
||||
The RL training system consists of three components:
|
||||
|
||||
1. **Atropos** — A trajectory API server that coordinates environment interactions, manages rollout groups, and computes advantages
|
||||
2. **Tinker** — A training service that handles model weights, LoRA training, sampling/inference, and optimizer steps
|
||||
3. **Environments** — Python classes that define tasks, scoring, and reward functions (e.g., GSM8K math problems)
|
||||
|
||||
The agent can discover environments, configure training parameters, launch training runs, and monitor metrics — all through a set of `rl_*` tools.
|
||||
|
||||
## Requirements
|
||||
|
||||
RL training requires:
|
||||
|
||||
- **Python >= 3.11** (Tinker package requirement)
|
||||
- **TINKER_API_KEY** — API key for the Tinker training service
|
||||
- **WANDB_API_KEY** — API key for Weights & Biases metrics tracking
|
||||
- The `tinker-atropos` submodule (at `tinker-atropos/` relative to the Hermes root)
|
||||
|
||||
```bash
|
||||
# Set up API keys
|
||||
hermes config set TINKER_API_KEY your-tinker-key
|
||||
hermes config set WANDB_API_KEY your-wandb-key
|
||||
```
|
||||
|
||||
When both keys are present and Python >= 3.11 is available, the `rl` toolset is automatically enabled.
|
||||
|
||||
## Available Tools
|
||||
|
||||
| Tool | Description |
|
||||
|------|-------------|
|
||||
| `rl_list_environments` | Discover available RL environments |
|
||||
| `rl_select_environment` | Select an environment and load its config |
|
||||
| `rl_get_current_config` | View configurable and locked fields |
|
||||
| `rl_edit_config` | Modify configurable training parameters |
|
||||
| `rl_start_training` | Launch a training run (spawns 3 processes) |
|
||||
| `rl_check_status` | Monitor training progress and WandB metrics |
|
||||
| `rl_stop_training` | Stop a running training job |
|
||||
| `rl_get_results` | Get final metrics and model weights path |
|
||||
| `rl_list_runs` | List all active and completed runs |
|
||||
| `rl_test_inference` | Quick inference test using OpenRouter |
|
||||
|
||||
## Workflow
|
||||
|
||||
### 1. Discover Environments
|
||||
|
||||
```
|
||||
List the available RL environments
|
||||
```
|
||||
|
||||
The agent calls `rl_list_environments()` which scans `tinker-atropos/tinker_atropos/environments/` using AST parsing to find Python classes inheriting from `BaseEnv`. Each environment defines:
|
||||
|
||||
- **Dataset loading** — where training data comes from (e.g., HuggingFace datasets)
|
||||
- **Prompt construction** — how to format items for the model
|
||||
- **Scoring/verification** — how to evaluate model outputs and assign rewards
|
||||
|
||||
### 2. Select and Configure
|
||||
|
||||
```
|
||||
Select the GSM8K environment and show me the configuration
|
||||
```
|
||||
|
||||
The agent calls `rl_select_environment("gsm8k_tinker")`, then `rl_get_current_config()` to see all parameters.
|
||||
|
||||
Configuration fields are divided into two categories:
|
||||
|
||||
**Configurable fields** (can be modified):
|
||||
- `group_size` — Number of completions per item (default: 16)
|
||||
- `batch_size` — Training batch size (default: 128)
|
||||
- `wandb_name` — WandB run name (auto-set to `{env}-{timestamp}`)
|
||||
- Other environment-specific parameters
|
||||
|
||||
**Locked fields** (infrastructure settings, cannot be changed):
|
||||
- `tokenizer_name` — Model tokenizer (e.g., `Qwen/Qwen3-8B`)
|
||||
- `rollout_server_url` — Atropos API URL (`http://localhost:8000`)
|
||||
- `max_token_length` — Maximum token length (8192)
|
||||
- `max_num_workers` — Maximum parallel workers (2048)
|
||||
- `total_steps` — Total training steps (2500)
|
||||
- `lora_rank` — LoRA adapter rank (32)
|
||||
- `learning_rate` — Learning rate (4e-5)
|
||||
- `max_token_trainer_length` — Max tokens for trainer (9000)
|
||||
|
||||
### 3. Start Training
|
||||
|
||||
```
|
||||
Start the training run
|
||||
```
|
||||
|
||||
The agent calls `rl_start_training()` which:
|
||||
|
||||
1. Generates a YAML config file merging locked settings with configurable overrides
|
||||
2. Creates a unique run ID
|
||||
3. Spawns three processes:
   - **Atropos API server** (`run-api`) — trajectory coordination
   - **Tinker trainer** (`launch_training.py`) — LoRA training + FastAPI inference server on port 8001
   - **Environment** (`environment.py serve`) — the selected environment connecting to Atropos
|
||||
|
||||
The processes start with staggered delays (5s for API, 30s for trainer, 90s more for environment) to ensure proper initialization order.
|
||||
|
||||
### 4. Monitor Progress
|
||||
|
||||
```
|
||||
Check the status of training run abc12345
|
||||
```
|
||||
|
||||
The agent calls `rl_check_status(run_id)` which reports:
|
||||
|
||||
- Process status (running/exited for each of the 3 processes)
|
||||
- Running time
|
||||
- WandB metrics (step, reward mean, percent correct, eval accuracy)
|
||||
- Log file locations for debugging
|
||||
|
||||
:::note Rate Limiting
|
||||
Status checks are rate-limited to once every **30 minutes** per run ID. This prevents excessive polling during long-running training jobs that take hours.
|
||||
:::
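
One way to picture the per-run limit is a small limiter keyed by run ID. This is an illustrative sketch only: the class name `StatusRateLimiter` and its methods are hypothetical, and the real tool may track or report the limit differently.

```python
import time

STATUS_CHECK_INTERVAL_S = 30 * 60  # 30 minutes, per the note above


class StatusRateLimiter:
    """Hypothetical per-run-ID limiter; not the actual Hermes implementation."""

    def __init__(self, interval_s=STATUS_CHECK_INTERVAL_S, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock
        self.last_check = {}  # run_id -> time of the last allowed check

    def seconds_until_allowed(self, run_id):
        last = self.last_check.get(run_id)
        if last is None:
            return 0.0
        return max(0.0, self.interval_s - (self.clock() - last))

    def try_check(self, run_id):
        """Return True and record the check if the run is outside its window."""
        if self.seconds_until_allowed(run_id) > 0:
            return False
        self.last_check[run_id] = self.clock()
        return True


# Drive the limiter with a fake clock so the example runs instantly.
now = [0.0]
limiter = StatusRateLimiter(clock=lambda: now[0])
first = limiter.try_check("abc12345")
now[0] = 10 * 60
second = limiter.try_check("abc12345")   # still inside the 30-minute window
other = limiter.try_check("def67890")    # a different run ID has its own window
now[0] = 31 * 60
third = limiter.try_check("abc12345")
print(first, second, other, third)
```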
|
||||
|
||||
### 5. Stop or Get Results
|
||||
|
||||
```
|
||||
Stop the training run
|
||||
# or
|
||||
Get the final results for run abc12345
|
||||
```
|
||||
|
||||
`rl_stop_training()` terminates all three processes in reverse order (environment → trainer → API). `rl_get_results()` retrieves final WandB metrics and training history.
|
||||
|
||||
## Inference Testing
|
||||
|
||||
Before committing to a full training run, you can test if an environment works correctly using `rl_test_inference`. This runs a few steps of inference and scoring using OpenRouter — no Tinker API needed, just an `OPENROUTER_API_KEY`.
|
||||
|
||||
```
|
||||
Test the selected environment with inference
|
||||
```
|
||||
|
||||
Default configuration:
|
||||
- **3 steps × 16 completions = 48 rollouts per model**
|
||||
- Tests 3 models at different scales for robustness:
  - `qwen/qwen3-8b` (small)
  - `z-ai/glm-4.7-flash` (medium)
  - `minimax/minimax-m2.7` (large)
- Total: ~144 rollouts
|
||||
|
||||
This validates:
|
||||
- Environment loads correctly
|
||||
- Prompt construction works
|
||||
- Inference response parsing is robust across model scales
|
||||
- Verifier/scoring logic produces valid rewards
|
||||
|
||||
## Tinker API Integration
|
||||
|
||||
The trainer uses the [Tinker](https://tinker.computer) API for model training operations:
|
||||
|
||||
- **ServiceClient** — Creates training and sampling clients
|
||||
- **Training client** — Handles forward-backward passes with importance sampling loss, optimizer steps (Adam), and weight checkpointing
|
||||
- **Sampling client** — Provides inference using the latest trained weights
|
||||
|
||||
The training loop:
|
||||
1. Fetches a batch of rollouts from Atropos (prompt + completions + scores)
|
||||
2. Converts to Tinker Datum objects with padded logprobs and advantages
|
||||
3. Runs forward-backward pass with importance sampling loss
|
||||
4. Takes an optimizer step (Adam: lr=4e-5, β1=0.9, β2=0.95)
|
||||
5. Saves weights and creates a new sampling client for next-step inference
|
||||
6. Logs metrics to WandB
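
For intuition about the advantages mentioned in step 2, the sketch below shows the group-relative normalization commonly associated with GRPO: each completion's score is compared with the other completions for the same prompt. This is a generic illustration. The exact formula, the `eps` value, and where the computation happens in Atropos or Tinker are assumptions.

```python
# Sketch of group-relative advantages as commonly used in GRPO. The exact
# normalization in Atropos/Tinker may differ; eps is an assumed stabilizer.
from statistics import mean, pstdev


def group_advantages(scores, eps=1e-6):
    """Normalize scores within one rollout group: (score - mean) / (std + eps)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


scores = [1.0, 0.0, 0.0, 1.0]  # e.g. correct / incorrect completions for one prompt
advantages = group_advantages(scores)
print(advantages)
```

Completions that beat the group average get a positive advantage and the rest get a negative one. A group where every completion scores the same yields all zeros and so contributes no learning signal.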
|
||||
|
||||
## Architecture Diagram
|
||||
|
||||
```mermaid
flowchart LR
    api["Atropos API<br/>run-api<br/>port 8000"]
    env["Environment<br/>BaseEnv implementation"]
    infer["OpenAI / sglang<br/>inference API<br/>port 8001"]
    trainer["Tinker Trainer<br/>LoRA training + FastAPI"]

    env <--> api
    env --> infer
    api -->|"batches: tokens, scores, logprobs"| trainer
    trainer -->|"serves inference"| infer
```
|
||||
|
||||
## Creating Custom Environments
|
||||
|
||||
To create a new RL environment:
|
||||
|
||||
1. Create a Python file in `tinker-atropos/tinker_atropos/environments/`
|
||||
2. Define a class that inherits from `BaseEnv`
|
||||
3. Implement the required methods:
   - `load_dataset()` — Load your training data
   - `get_next_item()` — Provide the next item to the model
   - `score_answer()` — Score model outputs and assign rewards
   - `collect_trajectories()` — Collect and return trajectories
4. Optionally define a custom config class inheriting from `BaseEnvConfig`
|
||||
|
||||
Study the existing `gsm8k_tinker.py` as a template. The agent can help you create new environments — it can read existing environment files, inspect HuggingFace datasets, and write new environment code.
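
As a starting point, the four methods could be laid out like the toy skeleton below. It is only a shape sketch: `BaseEnv` here is a local stand-in so the code runs on its own, and the real base class may use different signatures, return types, or async methods. Treat `gsm8k_tinker.py` as the authoritative reference.

```python
# Skeleton only: BaseEnv here is a local stand-in so the sketch runs on its own.
# The real base class's method signatures (including whether they are async)
# may differ from what is shown.

class BaseEnv:  # stand-in, not the real base class
    pass


class ReverseStringEnv(BaseEnv):
    """Toy task: the model must reverse a word."""

    def load_dataset(self):
        self.items = ["hermes", "atropos", "tinker"]
        self.cursor = 0

    def get_next_item(self):
        item = self.items[self.cursor % len(self.items)]
        self.cursor += 1
        return {"prompt": f"Reverse this word: {item}", "answer": item[::-1]}

    def score_answer(self, item, completion):
        # Reward 1.0 for an exact match, 0.0 otherwise.
        return 1.0 if completion.strip() == item["answer"] else 0.0

    def collect_trajectories(self, item, completions):
        return [
            {"prompt": item["prompt"], "completion": c, "score": self.score_answer(item, c)}
            for c in completions
        ]


env = ReverseStringEnv()
env.load_dataset()
item = env.get_next_item()
trajectories = env.collect_trajectories(item, ["semreh", "hermes"])
print([t["score"] for t in trajectories])
```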
|
||||
|
||||
## WandB Metrics
|
||||
|
||||
Training runs log to Weights & Biases with these key metrics:
|
||||
|
||||
| Metric | Description |
|
||||
|--------|-------------|
|
||||
| `train/loss` | Training loss (importance sampling) |
|
||||
| `train/learning_rate` | Current learning rate |
|
||||
| `reward/mean` | Mean reward across groups |
|
||||
| `logprobs/mean` | Mean reference logprobs |
|
||||
| `logprobs/mean_training` | Mean training logprobs |
|
||||
| `logprobs/diff` | Logprob drift (reference - training) |
|
||||
| `advantages/mean` | Mean advantage values |
|
||||
| `advantages/std` | Advantage standard deviation |
|
||||
|
||||
## Log Files
|
||||
|
||||
Each training run generates log files in `~/.hermes/logs/rl_training/`:
|
||||
|
||||
```
logs/
├── api_{run_id}.log        # Atropos API server logs
├── trainer_{run_id}.log    # Tinker trainer logs
├── env_{run_id}.log        # Environment process logs
└── inference_tests/        # Inference test results
    ├── test_{env}_{model}.jsonl
    └── test_{env}_{model}.log
```
|
||||
|
||||
These are invaluable for debugging when training fails or produces unexpected results.
|
||||
375
hermes_code/website/docs/user-guide/features/skills.md
Normal file
|
|
@ -0,0 +1,375 @@
|
|||
---
|
||||
sidebar_position: 2
|
||||
title: "Skills System"
|
||||
description: "On-demand knowledge documents — progressive disclosure, agent-managed skills, and the Skills Hub"
|
||||
---
|
||||
|
||||
# Skills System
|
||||
|
||||
Skills are on-demand knowledge documents the agent can load when needed. They follow a **progressive disclosure** pattern to minimize token usage and are compatible with the [agentskills.io](https://agentskills.io/specification) open standard.
|
||||
|
||||
All skills live in **`~/.hermes/skills/`** — a single directory that serves as the source of truth. On fresh install, bundled skills are copied from the repo. Hub-installed and agent-created skills also go here. The agent can modify or delete any skill.
|
||||
|
||||
See also:
|
||||
|
||||
- [Bundled Skills Catalog](/docs/reference/skills-catalog)
|
||||
- [Official Optional Skills Catalog](/docs/reference/optional-skills-catalog)
|
||||
|
||||
## Using Skills
|
||||
|
||||
Every installed skill is automatically available as a slash command:
|
||||
|
||||
```bash
|
||||
# In the CLI or any messaging platform:
|
||||
/gif-search funny cats
|
||||
/axolotl help me fine-tune Llama 3 on my dataset
|
||||
/github-pr-workflow create a PR for the auth refactor
|
||||
/plan design a rollout for migrating our auth provider
|
||||
|
||||
# Just the skill name loads it and lets the agent ask what you need:
|
||||
/excalidraw
|
||||
```
|
||||
|
||||
The bundled `plan` skill is a good example of a skill-backed slash command with custom behavior. Running `/plan [request]` tells Hermes to inspect context if needed, write a markdown implementation plan instead of executing the task, and save the result under `.hermes/plans/` relative to the active workspace/backend working directory.
|
||||
|
||||
You can also interact with skills through natural conversation:
|
||||
|
||||
```bash
|
||||
hermes chat --toolsets skills -q "What skills do you have?"
|
||||
hermes chat --toolsets skills -q "Show me the axolotl skill"
|
||||
```
|
||||
|
||||
## Progressive Disclosure
|
||||
|
||||
Skills use a token-efficient loading pattern:
|
||||
|
||||
```
|
||||
Level 0: skills_list() → [{name, description, category}, ...] (~3k tokens)
|
||||
Level 1: skill_view(name) → Full content + metadata (varies)
|
||||
Level 2: skill_view(name, path) → Specific reference file (varies)
|
||||
```
|
||||
|
||||
The agent only loads the full skill content when it actually needs it.
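
The three levels can be illustrated with a tiny in-memory model. Only the call shapes `skills_list()` and `skill_view(name, path)` come from the text above. The `SKILLS` dictionary and its layout are hypothetical stand-ins for the files under `~/.hermes/skills/`.

```python
# In-memory illustration of the three levels. The data layout is hypothetical;
# only the call shapes skills_list() and skill_view(name[, path]) come from the text.
SKILLS = {
    "axolotl": {
        "description": "Fine-tune LLMs with Axolotl",
        "category": "mlops",
        "content": "# Axolotl\n\n## Procedure\n1. Prepare the dataset...",
        "files": {"references/lora.md": "LoRA notes..."},
    },
}


def skills_list():
    """Level 0: cheap index of name, description, and category only."""
    return [
        {"name": name, "description": s["description"], "category": s["category"]}
        for name, s in SKILLS.items()
    ]


def skill_view(name, path=None):
    """Level 1 without path (full SKILL.md); Level 2 with path (one reference file)."""
    skill = SKILLS[name]
    return skill["content"] if path is None else skill["files"][path]


print(skills_list())
print(skill_view("axolotl", "references/lora.md"))
```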
|
||||
|
||||
## SKILL.md Format
|
||||
|
||||
```markdown
---
name: my-skill
description: Brief description of what this skill does
version: 1.0.0
platforms: [macos, linux]     # Optional — restrict to specific OS platforms
metadata:
  hermes:
    tags: [python, automation]
    category: devops
    fallback_for_toolsets: [web]    # Optional — conditional activation (see below)
    requires_toolsets: [terminal]   # Optional — conditional activation (see below)
---

# Skill Title

## When to Use
Trigger conditions for this skill.

## Procedure
1. Step one
2. Step two

## Pitfalls
- Known failure modes and fixes

## Verification
How to confirm it worked.
```
|
||||
|
||||
### Platform-Specific Skills
|
||||
|
||||
Skills can restrict themselves to specific operating systems using the `platforms` field:
|
||||
|
||||
| Value | Matches |
|
||||
|-------|---------|
|
||||
| `macos` | macOS (Darwin) |
|
||||
| `linux` | Linux |
|
||||
| `windows` | Windows |
|
||||
|
||||
```yaml
|
||||
platforms: [macos] # macOS only (e.g., iMessage, Apple Reminders, FindMy)
|
||||
platforms: [macos, linux] # macOS and Linux
|
||||
```
|
||||
|
||||
When set, the skill is automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms. If omitted, the skill loads on all platforms.
|
||||
|
||||
### Conditional Activation (Fallback Skills)
|
||||
|
||||
Skills can automatically show or hide themselves based on which tools are available in the current session. This is most useful for **fallback skills** — free or local alternatives that should only appear when a premium tool is unavailable.
|
||||
|
||||
```yaml
metadata:
  hermes:
    fallback_for_toolsets: [web]      # Show ONLY when these toolsets are unavailable
    requires_toolsets: [terminal]     # Show ONLY when these toolsets are available
    fallback_for_tools: [web_search]  # Show ONLY when these specific tools are unavailable
    requires_tools: [terminal]        # Show ONLY when these specific tools are available
```
|
||||
|
||||
| Field | Behavior |
|
||||
|-------|----------|
|
||||
| `fallback_for_toolsets` | Skill is **hidden** when the listed toolsets are available. Shown when they're missing. |
|
||||
| `fallback_for_tools` | Same, but checks individual tools instead of toolsets. |
|
||||
| `requires_toolsets` | Skill is **hidden** when the listed toolsets are unavailable. Shown when they're present. |
|
||||
| `requires_tools` | Same, but checks individual tools. |
|
||||
|
||||
**Example:** The built-in `duckduckgo-search` skill uses `fallback_for_toolsets: [web]`. When you have `FIRECRAWL_API_KEY` set, the web toolset is available and the agent uses `web_search` — the DuckDuckGo skill stays hidden. If the API key is missing, the web toolset is unavailable and the DuckDuckGo skill automatically appears as a fallback.
|
||||
|
||||
Skills without any conditional fields behave exactly as before — they're always shown.
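
The visibility rules in the table can be sketched as a single predicate. The function name `is_skill_visible` is hypothetical. How Hermes treats a field that lists several entries (any versus all) is not stated here, so the sketch assumes a fallback skill hides as soon as any listed item is available and a `requires_*` skill hides as soon as any listed item is missing.

```python
def is_skill_visible(hermes_meta, toolsets, tools):
    """Assumed semantics: a fallback skill hides when any listed item is available,
    and a requires_* skill hides when any listed item is missing."""
    toolsets, tools = set(toolsets), set(tools)
    if toolsets & set(hermes_meta.get("fallback_for_toolsets", [])):
        return False
    if tools & set(hermes_meta.get("fallback_for_tools", [])):
        return False
    if not set(hermes_meta.get("requires_toolsets", [])) <= toolsets:
        return False
    if not set(hermes_meta.get("requires_tools", [])) <= tools:
        return False
    return True


ddg = {"fallback_for_toolsets": ["web"]}
with_web = is_skill_visible(ddg, toolsets={"web", "terminal"}, tools=set())
without_web = is_skill_visible(ddg, toolsets={"terminal"}, tools=set())
print(with_web, without_web)
```

A skill with no conditional fields passes every check and is always visible, matching the behavior described above.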
|
||||
|
||||
## Secure Setup on Load
|
||||
|
||||
Skills can declare required environment variables without disappearing from discovery:
|
||||
|
||||
```yaml
required_environment_variables:
  - name: TENOR_API_KEY
    prompt: Tenor API key
    help: Get a key from https://developers.google.com/tenor
    required_for: full functionality
```
|
||||
|
||||
When a missing value is encountered, Hermes asks for it securely only when the skill is actually loaded in the local CLI. You can skip setup and keep using the skill. Messaging surfaces never ask for secrets in chat — they tell you to use `hermes setup` or `~/.hermes/.env` locally instead.
|
||||
|
||||
Once set, declared env vars are **automatically passed through** to `execute_code` and `terminal` sandboxes — the skill's scripts can use `$TENOR_API_KEY` directly. For non-skill env vars, use the `terminal.env_passthrough` config option. See [Environment Variable Passthrough](/docs/user-guide/security#environment-variable-passthrough) for details.
|
||||
|
||||
## Skill Directory Structure
|
||||
|
||||
```text
~/.hermes/skills/                  # Single source of truth
├── mlops/                         # Category directory
│   ├── axolotl/
│   │   ├── SKILL.md               # Main instructions (required)
│   │   ├── references/            # Additional docs
│   │   ├── templates/             # Output formats
│   │   ├── scripts/               # Helper scripts callable from the skill
│   │   └── assets/                # Supplementary files
│   └── vllm/
│       └── SKILL.md
├── devops/
│   └── deploy-k8s/                # Agent-created skill
│       ├── SKILL.md
│       └── references/
├── .hub/                          # Skills Hub state
│   ├── lock.json
│   ├── quarantine/
│   └── audit.log
└── .bundled_manifest              # Tracks seeded bundled skills
```
|
||||
|
||||
## Agent-Managed Skills (skill_manage tool)
|
||||
|
||||
The agent can create, update, and delete its own skills via the `skill_manage` tool. This is the agent's **procedural memory** — when it figures out a non-trivial workflow, it saves the approach as a skill for future reuse.
|
||||
|
||||
### When the Agent Creates Skills
|
||||
|
||||
- After completing a complex task (5+ tool calls) successfully
|
||||
- When it hit errors or dead ends and found the working path
|
||||
- When the user corrected its approach
|
||||
- When it discovered a non-trivial workflow
|
||||
|
||||
### Actions
|
||||
|
||||
| Action | Use for | Key params |
|
||||
|--------|---------|------------|
|
||||
| `create` | New skill from scratch | `name`, `content` (full SKILL.md), optional `category` |
|
||||
| `patch` | Targeted fixes (preferred) | `name`, `old_string`, `new_string` |
|
||||
| `edit` | Major structural rewrites | `name`, `content` (full SKILL.md replacement) |
|
||||
| `delete` | Remove a skill entirely | `name` |
|
||||
| `write_file` | Add/update supporting files | `name`, `file_path`, `file_content` |
|
||||
| `remove_file` | Remove a supporting file | `name`, `file_path` |
|
||||
|
||||
:::tip
|
||||
The `patch` action is preferred for updates — it's more token-efficient than `edit` because only the changed text appears in the tool call.
|
||||
:::
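
To show why `patch` is cheaper than `edit`, here is a sketch of a string-replacement patch: the tool call carries only `old_string` and `new_string`, not the whole file. The helper `apply_patch` is hypothetical, and requiring exactly one match is an assumption made to keep the edit unambiguous. The real tool's matching rules may differ.

```python
# Sketch of a patch-style edit. Requiring exactly one match is an assumption
# made here to keep the edit unambiguous; the real tool's rules may differ.
def apply_patch(content, old_string, new_string):
    count = content.count(old_string)
    if count != 1:
        raise ValueError(f"expected exactly one match for old_string, found {count}")
    return content.replace(old_string, new_string)


skill_md = "## Procedure\n1. Run `kubectl apply`\n2. Check pods\n"
patched = apply_patch(skill_md, "2. Check pods", "2. Check pods with `kubectl get pods`")
print(patched)
```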
|
||||
|
||||
## Skills Hub
|
||||
|
||||
Browse, search, install, and manage skills from online registries, `skills.sh`, direct well-known skill endpoints, and official optional skills.
|
||||
|
||||
### Common commands
|
||||
|
||||
```bash
|
||||
hermes skills browse # Browse all hub skills (official first)
|
||||
hermes skills browse --source official # Browse only official optional skills
|
||||
hermes skills search kubernetes # Search all sources
|
||||
hermes skills search react --source skills-sh # Search the skills.sh directory
|
||||
hermes skills search https://mintlify.com/docs --source well-known
|
||||
hermes skills inspect openai/skills/k8s # Preview before installing
|
||||
hermes skills install openai/skills/k8s # Install with security scan
|
||||
hermes skills install official/security/1password
|
||||
hermes skills install skills-sh/vercel-labs/json-render/json-render-react --force
|
||||
hermes skills install well-known:https://mintlify.com/docs/.well-known/skills/mintlify
|
||||
hermes skills list --source hub # List hub-installed skills
|
||||
hermes skills check # Check installed hub skills for upstream updates
|
||||
hermes skills update # Reinstall hub skills with upstream changes when needed
|
||||
hermes skills audit # Re-scan all hub skills for security
|
||||
hermes skills uninstall k8s # Remove a hub skill
|
||||
hermes skills publish skills/my-skill --to github --repo owner/repo
|
||||
hermes skills snapshot export setup.json # Export skill config
|
||||
hermes skills tap add myorg/skills-repo # Add a custom GitHub source
|
||||
```
|
||||
|
||||
### Supported hub sources
|
||||
|
||||
| Source | Example | Notes |
|
||||
|--------|---------|-------|
|
||||
| `official` | `official/security/1password` | Optional skills shipped with Hermes. |
|
||||
| `skills-sh` | `skills-sh/vercel-labs/agent-skills/vercel-react-best-practices` | Searchable via `hermes skills search <query> --source skills-sh`. Hermes resolves alias-style skills when the skills.sh slug differs from the repo folder. |
|
||||
| `well-known` | `well-known:https://mintlify.com/docs/.well-known/skills/mintlify` | Skills served directly from `/.well-known/skills/index.json` on a website. Search using the site or docs URL. |
|
||||
| `github` | `openai/skills/k8s` | Direct GitHub repo/path installs and custom taps. |
|
||||
| `clawhub`, `lobehub`, `claude-marketplace` | Source-specific identifiers | Community or marketplace integrations. |
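
The example identifiers in the table suggest that the source can often be read off the identifier's prefix. The sketch below illustrates that idea only. `classify_skill_identifier` is a hypothetical helper, not part of Hermes, and its prefix rules are inferred from the examples above rather than from the actual resolver. Sources with source-specific identifiers (`clawhub`, `lobehub`, `claude-marketplace`) are not modeled.

```python
def classify_skill_identifier(identifier: str) -> str:
    """Guess the hub source from an identifier's prefix (illustrative only)."""
    if identifier.startswith("well-known:"):
        return "well-known"
    first_segment = identifier.split("/", 1)[0]
    if first_segment in ("official", "skills-sh"):
        return first_segment
    # Assumption: anything else shaped like owner/repo/path is a GitHub install.
    return "github"


for example in (
    "official/security/1password",
    "skills-sh/vercel-labs/agent-skills/vercel-react-best-practices",
    "well-known:https://mintlify.com/docs/.well-known/skills/mintlify",
    "openai/skills/k8s",
):
    print(f"{classify_skill_identifier(example):<10} {example}")
```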
|
||||
|
||||
### Integrated hubs and registries
|
||||
|
||||
Hermes currently integrates with these skills ecosystems and discovery sources:
|
||||
|
||||
#### 1. Official optional skills (`official`)
|
||||
|
||||
These are maintained in the Hermes repository itself and install with builtin trust.
|
||||
|
||||
- Catalog: [Official Optional Skills Catalog](../../reference/optional-skills-catalog)
|
||||
- Source in repo: `optional-skills/`
|
||||
- Example:
|
||||
|
||||
```bash
|
||||
hermes skills browse --source official
|
||||
hermes skills install official/security/1password
|
||||
```
|
||||
|
||||
#### 2. skills.sh (`skills-sh`)
|
||||
|
||||
This is Vercel's public skills directory. Hermes can search it directly, inspect skill detail pages, resolve alias-style slugs, and install from the underlying source repo.
|
||||
|
||||
- Directory: [skills.sh](https://skills.sh/)
|
||||
- CLI/tooling repo: [vercel-labs/skills](https://github.com/vercel-labs/skills)
|
||||
- Official Vercel skills repo: [vercel-labs/agent-skills](https://github.com/vercel-labs/agent-skills)
|
||||
- Example:
|
||||
|
||||
```bash
|
||||
hermes skills search react --source skills-sh
|
||||
hermes skills inspect skills-sh/vercel-labs/json-render/json-render-react
|
||||
hermes skills install skills-sh/vercel-labs/json-render/json-render-react --force
|
||||
```
|
||||
|
||||
#### 3. Well-known skill endpoints (`well-known`)
|
||||
|
||||
This is URL-based discovery from sites that publish `/.well-known/skills/index.json`. It is not a single centralized hub — it is a web discovery convention.
|
||||
|
||||
- Example live endpoint: [Mintlify docs skills index](https://mintlify.com/docs/.well-known/skills/index.json)
|
||||
- Reference server implementation: [vercel-labs/skills-handler](https://github.com/vercel-labs/skills-handler)
|
||||
- Example:
|
||||
|
||||
```bash
|
||||
hermes skills search https://mintlify.com/docs --source well-known
|
||||
hermes skills inspect well-known:https://mintlify.com/docs/.well-known/skills/mintlify
|
||||
hermes skills install well-known:https://mintlify.com/docs/.well-known/skills/mintlify
|
||||
```
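
A minimal sketch of how a site or docs URL could map to its discovery index, assuming the index sits at `/.well-known/skills/index.json` directly beneath the URL you search with, as the Mintlify example suggests. `well_known_index_url` is a hypothetical helper; Hermes may resolve the location differently.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_index_url(site_url: str) -> str:
    """Append /.well-known/skills/index.json to a site or docs URL."""
    parts = urlsplit(site_url)
    path = parts.path.rstrip("/") + "/.well-known/skills/index.json"
    # Query strings and fragments are dropped; only scheme, host, and path matter.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(well_known_index_url("https://mintlify.com/docs"))
# https://mintlify.com/docs/.well-known/skills/index.json
```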
|
||||
|
||||
#### 4. Direct GitHub skills (`github`)
|
||||
|
||||
Hermes can install directly from GitHub repositories and GitHub-based taps. This is useful when you already know the repo/path or want to add your own custom source repo.
|
||||
|
||||
- OpenAI skills: [openai/skills](https://github.com/openai/skills)
|
||||
- Anthropic skills: [anthropics/skills](https://github.com/anthropics/skills)
|
||||
- Example community tap source: [VoltAgent/awesome-agent-skills](https://github.com/VoltAgent/awesome-agent-skills)
|
||||
- Example:
|
||||
|
||||
```bash
|
||||
hermes skills install openai/skills/k8s
|
||||
hermes skills tap add myorg/skills-repo
|
||||
```
|
||||
|
||||
#### 5. ClawHub (`clawhub`)
|
||||
|
||||
A third-party skills marketplace integrated as a community source.
|
||||
|
||||
- Site: [clawhub.ai](https://clawhub.ai/)
|
||||
- Hermes source id: `clawhub`
|
||||
|
||||
#### 6. Claude marketplace-style repos (`claude-marketplace`)
|
||||
|
||||
Hermes supports marketplace repos that publish Claude-compatible plugin/marketplace manifests.
|
||||
|
||||
Known integrated sources include:
|
||||
- [anthropics/skills](https://github.com/anthropics/skills)
|
||||
- [aiskillstore/marketplace](https://github.com/aiskillstore/marketplace)
|
||||
|
||||
Hermes source id: `claude-marketplace`
|
||||
|
||||
#### 7. LobeHub (`lobehub`)
|
||||
|
||||
Hermes can search and convert agent entries from LobeHub's public catalog into installable Hermes skills.
|
||||
|
||||
- Site: [LobeHub](https://lobehub.com/)
|
||||
- Public agents index: [chat-agents.lobehub.com](https://chat-agents.lobehub.com/)
|
||||
- Backing repo: [lobehub/lobe-chat-agents](https://github.com/lobehub/lobe-chat-agents)
|
||||
- Hermes source id: `lobehub`
|
||||
|
||||
### Security scanning and `--force`
|
||||
|
||||
All hub-installed skills go through a **security scanner** that checks for data exfiltration, prompt injection, destructive commands, supply-chain signals, and other threats.
|
||||
|
||||
`hermes skills inspect ...` now also surfaces upstream metadata when available:
|
||||
- repo URL
|
||||
- skills.sh detail page URL
|
||||
- install command
|
||||
- weekly installs
|
||||
- upstream security audit statuses
|
||||
- well-known index/endpoint URLs
|
||||
|
||||
Use `--force` when you have reviewed a third-party skill and want to override a non-dangerous policy block:
|
||||
|
||||
```bash
|
||||
hermes skills install skills-sh/anthropics/skills/pdf --force
|
||||
```
|
||||
|
||||
Important behavior:
|
||||
- `--force` can override policy blocks for caution/warn-style findings.
|
||||
- `--force` does **not** override a `dangerous` scan verdict.
|
||||
- Official optional skills (`official/...`) are treated as builtin trust and do not show the third-party warning panel.
|
||||
|
||||
### Trust levels
|
||||
|
||||
| Level | Source | Policy |
|
||||
|-------|--------|--------|
|
||||
| `builtin` | Ships with Hermes | Always trusted |
|
||||
| `official` | `optional-skills/` in the repo | Builtin trust, no third-party warning |
|
||||
| `trusted` | Trusted registries/repos such as `openai/skills`, `anthropics/skills` | More permissive policy than community sources |
|
||||
| `community` | Everything else (`skills.sh`, well-known endpoints, custom GitHub repos, most marketplaces) | Non-dangerous findings can be overridden with `--force`; `dangerous` verdicts stay blocked |
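
The interaction between trust level, scan verdict, and `--force` can be summarized as a small decision function. This is a sketch under stated assumptions, not the Hermes implementation: only the `dangerous` verdict name comes from the documentation, while `safe` and `caution` are placeholder labels, and the behavior of the `trusted` level is an assumption because the table only says it is "more permissive".

```python
def install_allowed(trust: str, verdict: str, force: bool = False) -> bool:
    """Decide whether an install proceeds (illustrative policy, not Hermes code)."""
    if trust in ("builtin", "official"):
        return True  # builtin trust per the table
    if verdict == "dangerous":
        return False  # --force never overrides a dangerous verdict
    if verdict == "safe":
        return True  # assumed label for a clean scan
    if trust == "trusted":
        return True  # assumption: "more permissive" means findings do not block
    return force  # community: non-dangerous findings need --force


print(install_allowed("community", "caution"))                # False
print(install_allowed("community", "caution", force=True))    # True
print(install_allowed("community", "dangerous", force=True))  # False
```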
|
||||
|
||||
### Update lifecycle
|
||||
|
||||
The hub now tracks enough provenance to re-check upstream copies of installed skills:
|
||||
|
||||
```bash
|
||||
hermes skills check # Report which installed hub skills changed upstream
|
||||
hermes skills update # Reinstall only the skills with updates available
|
||||
hermes skills update react # Update one specific installed hub skill
|
||||
```
|
||||
|
||||
This uses the stored source identifier plus the current upstream bundle content hash to detect drift.
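
To make the idea of hash-based drift detection concrete, here is a minimal sketch. The hashing scheme (SHA-256 over sorted paths and contents) and the helper names `bundle_hash` and `has_drifted` are assumptions for illustration; the actual bundle format and hash used by Hermes are not documented here.

```python
import hashlib


def bundle_hash(files: dict[str, bytes]) -> str:
    """Hash a skill bundle deterministically: sorted paths, then contents."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator so path and content cannot run together
        digest.update(files[path])
        digest.update(b"\0")
    return digest.hexdigest()


def has_drifted(installed_hash: str, upstream_files: dict[str, bytes]) -> bool:
    """Compare the hash stored at install time with the current upstream bundle."""
    return bundle_hash(upstream_files) != installed_hash


installed = bundle_hash({"SKILL.md": b"v1"})  # placeholder file name and content
print(has_drifted(installed, {"SKILL.md": b"v1"}))  # False
print(has_drifted(installed, {"SKILL.md": b"v2"}))  # True
```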
|
||||
|
||||
### Slash commands (inside chat)
|
||||
|
||||
All the same commands work with `/skills`:
|
||||
|
||||
```text
|
||||
/skills browse
|
||||
/skills search react --source skills-sh
|
||||
/skills search https://mintlify.com/docs --source well-known
|
||||
/skills inspect skills-sh/vercel-labs/json-render/json-render-react
|
||||
/skills install openai/skills/skill-creator --force
|
||||
/skills check
|
||||
/skills update
|
||||
/skills list
|
||||
```
|
||||
|
||||
Official optional skills still use identifiers like `official/security/1password` and `official/migration/openclaw-migration`.
|
||||
81
hermes_code/website/docs/user-guide/features/skins.md
Normal file
|
|
@ -0,0 +1,81 @@
|
|||
---
|
||||
sidebar_position: 10
|
||||
title: "Skins & Themes"
|
||||
description: "Customize the Hermes CLI with built-in and user-defined skins"
|
||||
---
|
||||
|
||||
# Skins & Themes
|
||||
|
||||
Skins control the **visual presentation** of the Hermes CLI: banner colors, spinner faces and verbs, response-box labels, branding text, and the tool activity prefix.
|
||||
|
||||
Conversational style and visual style are separate concepts:
|
||||
|
||||
- **Personality** changes the agent's tone and wording.
|
||||
- **Skin** changes the CLI's appearance.
|
||||
|
||||
## Change skins
|
||||
|
||||
```bash
|
||||
/skin # show the current skin and list available skins
|
||||
/skin ares # switch to a built-in skin
|
||||
/skin mytheme # switch to a custom skin from ~/.hermes/skins/mytheme.yaml
|
||||
```
|
||||
|
||||
Or set the default skin in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
display:
|
||||
skin: default
|
||||
```
|
||||
|
||||
## Built-in skins
|
||||
|
||||
| Skin | Description | Agent branding |
|
||||
|------|-------------|----------------|
|
||||
| `default` | Classic Hermes — gold and kawaii | `Hermes Agent` |
|
||||
| `ares` | War-god theme — crimson and bronze | `Ares Agent` |
|
||||
| `mono` | Monochrome — clean grayscale | `Hermes Agent` |
|
||||
| `slate` | Cool blue — developer-focused | `Hermes Agent` |
|
||||
| `poseidon` | Ocean-god theme — deep blue and seafoam | `Poseidon Agent` |
|
||||
| `sisyphus` | Sisyphean theme — austere grayscale with persistence | `Sisyphus Agent` |
|
||||
| `charizard` | Volcanic theme — burnt orange and ember | `Charizard Agent` |
|
||||
|
||||
## What a skin can customize
|
||||
|
||||
| Area | Keys |
|
||||
|------|------|
|
||||
| Banner + response colors | `colors.banner_*`, `colors.response_border` |
|
||||
| Spinner animation | `spinner.waiting_faces`, `spinner.thinking_faces`, `spinner.thinking_verbs`, `spinner.wings` |
|
||||
| Branding text | `branding.agent_name`, `branding.welcome`, `branding.response_label`, `branding.prompt_symbol` |
|
||||
| Tool activity prefix | `tool_prefix` |
|
||||
|
||||
## Custom skins
|
||||
|
||||
Create YAML files under `~/.hermes/skins/`. User skins inherit missing values from the built-in `default` skin.
|
||||
|
||||
```yaml
|
||||
name: cyberpunk
|
||||
description: Neon terminal theme
|
||||
|
||||
colors:
|
||||
banner_border: "#FF00FF"
|
||||
banner_title: "#00FFFF"
|
||||
banner_accent: "#FF1493"
|
||||
|
||||
spinner:
|
||||
thinking_verbs: ["jacking in", "decrypting", "uploading"]
|
||||
wings:
|
||||
- ["⟨⚡", "⚡⟩"]
|
||||
|
||||
branding:
|
||||
agent_name: "Cyber Agent"
|
||||
response_label: " ⚡ Cyber "
|
||||
|
||||
tool_prefix: "▏"
|
||||
```
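
Inheritance from the `default` skin can be pictured as a recursive merge. The sketch below assumes nested keys are merged one by one, so a custom skin that sets only `colors.banner_border` keeps the other default colors. `merge_skin` and the default values shown are hypothetical and do not reproduce the skin engine's code or its real defaults.

```python
def merge_skin(default: dict, user: dict) -> dict:
    """Recursively overlay user values on top of the default skin."""
    merged = dict(default)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_skin(merged[key], value)
        else:
            merged[key] = value
    return merged


default_skin = {  # placeholder values, not the real built-in default
    "colors": {"banner_border": "#FFD700", "banner_title": "#FFD700"},
    "branding": {"agent_name": "Hermes Agent"},
}
user_skin = {"colors": {"banner_border": "#FF00FF"}}
print(merge_skin(default_skin, user_skin))
```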
|
||||
|
||||
## Operational notes
|
||||
|
||||
- Built-in skins load from `hermes_cli/skin_engine.py`.
|
||||
- Unknown skins automatically fall back to `default`.
|
||||
- `/skin` updates the active CLI theme immediately for the current session.
|
||||
165
hermes_code/website/docs/user-guide/features/tools.md
Normal file
|
|
@ -0,0 +1,165 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Tools & Toolsets"
|
||||
description: "Overview of Hermes Agent's tools — what's available, how toolsets work, and terminal backends"
|
||||
---
|
||||
|
||||
# Tools & Toolsets
|
||||
|
||||
Tools are functions that extend the agent's capabilities. They're organized into logical **toolsets** that can be enabled or disabled per platform.
|
||||
|
||||
## Available Tools
|
||||
|
||||
Hermes ships with a broad built-in tool registry covering web search, browser automation, terminal execution, file editing, memory, delegation, RL training, messaging delivery, Home Assistant, Honcho memory, and more.
|
||||
|
||||
High-level categories:
|
||||
|
||||
| Category | Examples | Description |
|
||||
|----------|----------|-------------|
|
||||
| **Web** | `web_search`, `web_extract` | Search the web and extract page content. |
|
||||
| **Terminal & Files** | `terminal`, `process`, `read_file`, `patch` | Execute commands and manipulate files. |
|
||||
| **Browser** | `browser_navigate`, `browser_snapshot`, `browser_vision` | Interactive browser automation with text and vision support. |
|
||||
| **Media** | `vision_analyze`, `image_generate`, `text_to_speech` | Multimodal analysis and generation. |
|
||||
| **Agent orchestration** | `todo`, `clarify`, `execute_code`, `delegate_task` | Planning, clarification, code execution, and subagent delegation. |
|
||||
| **Memory & recall** | `memory`, `session_search`, `honcho_*` | Persistent memory, session search, and Honcho cross-session context. |
|
||||
| **Automation & delivery** | `cronjob`, `send_message` | Scheduled tasks with create/list/update/pause/resume/run/remove actions, plus outbound messaging delivery. |
|
||||
| **Integrations** | `ha_*`, MCP server tools, `rl_*` | Home Assistant, MCP, RL training, and other integrations. |
|
||||
|
||||
For the authoritative code-derived registry, see [Built-in Tools Reference](/docs/reference/tools-reference) and [Toolsets Reference](/docs/reference/toolsets-reference).
|
||||
|
||||
## Using Toolsets
|
||||
|
||||
```bash
|
||||
# Use specific toolsets
|
||||
hermes chat --toolsets "web,terminal"
|
||||
|
||||
# See all available tools
|
||||
hermes tools
|
||||
|
||||
# Configure tools per platform (interactive)
|
||||
hermes tools
|
||||
```
|
||||
|
||||
Common toolsets include `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `honcho`, `homeassistant`, and `rl`.
|
||||
|
||||
See [Toolsets Reference](/docs/reference/toolsets-reference) for the full set, including platform presets such as `hermes-cli`, `hermes-telegram`, and dynamic MCP toolsets like `mcp-<server>`.
|
||||
|
||||
## Terminal Backends
|
||||
|
||||
The terminal tool can execute commands in different environments:
|
||||
|
||||
| Backend | Description | Use Case |
|
||||
|---------|-------------|----------|
|
||||
| `local` | Run on your machine (default) | Development, trusted tasks |
|
||||
| `docker` | Isolated containers | Security, reproducibility |
|
||||
| `ssh` | Remote server | Sandboxing, keep agent away from its own code |
|
||||
| `singularity` | HPC containers | Cluster computing, rootless |
|
||||
| `modal` | Cloud execution | Serverless, scale |
|
||||
| `daytona` | Cloud sandbox workspace | Persistent remote dev environments |
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
terminal:
|
||||
backend: local # or: docker, ssh, singularity, modal, daytona
|
||||
cwd: "." # Working directory
|
||||
timeout: 180 # Command timeout in seconds
|
||||
```
|
||||
|
||||
### Docker Backend
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: docker
|
||||
docker_image: python:3.11-slim
|
||||
```
|
||||
|
||||
### SSH Backend
|
||||
|
||||
Recommended for security, because the agent can't modify its own code when commands run on a remote host:
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: ssh
|
||||
```
|
||||
```bash
|
||||
# Set credentials in ~/.hermes/.env
|
||||
TERMINAL_SSH_HOST=my-server.example.com
|
||||
TERMINAL_SSH_USER=myuser
|
||||
TERMINAL_SSH_KEY=~/.ssh/id_rsa
|
||||
```
|
||||
|
||||
### Singularity/Apptainer
|
||||
|
||||
```bash
|
||||
# Pre-build SIF for parallel workers
|
||||
apptainer build ~/python.sif docker://python:3.11-slim
|
||||
|
||||
# Configure
|
||||
hermes config set terminal.backend singularity
|
||||
hermes config set terminal.singularity_image ~/python.sif
|
||||
```
|
||||
|
||||
### Modal (Serverless Cloud)
|
||||
|
||||
```bash
|
||||
uv pip install "swe-rex[modal]"
|
||||
modal setup
|
||||
hermes config set terminal.backend modal
|
||||
```
|
||||
|
||||
### Container Resources
|
||||
|
||||
Configure CPU, memory, disk, and persistence for all container backends:
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: docker # or singularity, modal, daytona
|
||||
container_cpu: 1 # CPU cores (default: 1)
|
||||
container_memory: 5120 # Memory in MB (default: 5GB)
|
||||
container_disk: 51200 # Disk in MB (default: 50GB)
|
||||
container_persistent: true # Persist filesystem across sessions (default: true)
|
||||
```
|
||||
|
||||
When `container_persistent: true`, installed packages, files, and config survive across sessions.
|
||||
|
||||
### Container Security
|
||||
|
||||
All container backends run with security hardening:
|
||||
|
||||
- Read-only root filesystem (Docker)
|
||||
- All Linux capabilities dropped
|
||||
- No privilege escalation
|
||||
- PID limits (256 processes)
|
||||
- Full namespace isolation
|
||||
- Persistent workspace via volumes, not writable root layer
|
||||
|
||||
Docker can optionally receive an explicit env allowlist via `terminal.docker_forward_env`, but forwarded variables are visible to commands inside the container and should be treated as exposed to that session.
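
An allowlist means that only explicitly named variables cross into the container. The sketch below shows that filtering step in isolation; `forwarded_env` is a hypothetical helper and the variable names are made up, so treat it as an illustration of the concept rather than the way `terminal.docker_forward_env` is implemented.

```python
def forwarded_env(host_env: dict[str, str], allowlist: list[str]) -> dict[str, str]:
    """Keep only explicitly allowlisted variables that are actually set."""
    return {name: host_env[name] for name in allowlist if name in host_env}


host = {"GITHUB_TOKEN": "ghp_example", "HOME": "/home/me", "AWS_SECRET": "s3cr3t"}
print(forwarded_env(host, ["GITHUB_TOKEN", "NPM_TOKEN"]))
# {'GITHUB_TOKEN': 'ghp_example'}
```

Everything returned by the filter is readable by any command in the container, which is why the allowlist should stay as short as possible.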
|
||||
|
||||
## Background Process Management
|
||||
|
||||
Start background processes and manage them:
|
||||
|
||||
```python
|
||||
terminal(command="pytest -v tests/", background=True)
|
||||
# Returns: {"session_id": "proc_abc123", "pid": 12345}
|
||||
|
||||
# Then manage with the process tool:
|
||||
process(action="list") # Show all running processes
|
||||
process(action="poll", session_id="proc_abc123") # Check status
|
||||
process(action="wait", session_id="proc_abc123") # Block until done
|
||||
process(action="log", session_id="proc_abc123") # Full output
|
||||
process(action="kill", session_id="proc_abc123") # Terminate
|
||||
process(action="write", session_id="proc_abc123", data="y") # Send input
|
||||
```
|
||||
|
||||
PTY mode (`pty=true`) enables interactive CLI tools like Codex and Claude Code.
|
||||
|
||||
## Sudo Support
|
||||
|
||||
If a command needs sudo, you'll be prompted for your password, which is cached for the session. Alternatively, set `SUDO_PASSWORD` in `~/.hermes/.env`.
|
||||
|
||||
:::warning
|
||||
On messaging platforms, if sudo fails, the output includes a tip to add `SUDO_PASSWORD` to `~/.hermes/.env`.
|
||||
:::
|
||||
128
hermes_code/website/docs/user-guide/features/tts.md
Normal file
|
|
@ -0,0 +1,128 @@
|
|||
---
|
||||
sidebar_position: 9
|
||||
title: "Voice & TTS"
|
||||
description: "Text-to-speech and voice message transcription across all platforms"
|
||||
---
|
||||
|
||||
# Voice & TTS
|
||||
|
||||
Hermes Agent supports both text-to-speech output and voice message transcription across all messaging platforms.
|
||||
|
||||
## Text-to-Speech
|
||||
|
||||
Convert text to speech with four providers:
|
||||
|
||||
| Provider | Quality | Cost | API Key |
|
||||
|----------|---------|------|---------|
|
||||
| **Edge TTS** (default) | Good | Free | None needed |
|
||||
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
|
||||
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
|
||||
| **NeuTTS** | Good | Free | None needed |
|
||||
|
||||
### Platform Delivery
|
||||
|
||||
| Platform | Delivery | Format |
|
||||
|----------|----------|--------|
|
||||
| Telegram | Voice bubble (plays inline) | Opus `.ogg` |
|
||||
| Discord | Voice bubble (Opus/OGG), falls back to file attachment | Opus/MP3 |
|
||||
| WhatsApp | Audio file attachment | MP3 |
|
||||
| CLI | Saved to `~/.hermes/audio_cache/` | MP3 |
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
tts:
|
||||
provider: "edge" # "edge" | "elevenlabs" | "openai" | "neutts"
|
||||
edge:
|
||||
voice: "en-US-AriaNeural" # 322 voices, 74 languages
|
||||
elevenlabs:
|
||||
voice_id: "pNInz6obpgDQGcFmaJgB" # Adam
|
||||
model_id: "eleven_multilingual_v2"
|
||||
openai:
|
||||
model: "gpt-4o-mini-tts"
|
||||
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
|
||||
base_url: "https://api.openai.com/v1" # Override for OpenAI-compatible TTS endpoints
|
||||
neutts:
|
||||
ref_audio: ''
|
||||
ref_text: ''
|
||||
model: neuphonic/neutts-air-q4-gguf
|
||||
device: cpu
|
||||
```
|
||||
|
||||
### Telegram Voice Bubbles & ffmpeg
|
||||
|
||||
Telegram voice bubbles require Opus/OGG audio format:
|
||||
|
||||
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup
|
||||
- **Edge TTS** (default) outputs MP3 and needs **ffmpeg** to convert it to Opus/OGG
- **NeuTTS** outputs WAV and also needs **ffmpeg** to convert it for Telegram voice bubbles

Install ffmpeg with your system package manager:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install ffmpeg
|
||||
|
||||
# macOS
|
||||
brew install ffmpeg
|
||||
|
||||
# Fedora
|
||||
sudo dnf install ffmpeg
|
||||
```
|
||||
|
||||
Without ffmpeg, Edge TTS and NeuTTS audio are sent as regular audio files (playable, but shown as a rectangular player instead of a voice bubble).
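
For reference, converting MP3 or WAV to Opus in an Ogg container is a single ffmpeg invocation. The sketch below only builds and prints that command; the file names are placeholders, `opus_convert_command` is a hypothetical helper, and the exact arguments Hermes passes to ffmpeg are not documented here.

```python
import shutil


def opus_convert_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argv that re-encodes the input audio with the libopus encoder."""
    # -y overwrites dst if it exists; the .ogg extension selects the Ogg container.
    return ["ffmpeg", "-y", "-i", src, "-c:a", "libopus", dst]


cmd = opus_convert_command("reply.mp3", "reply.ogg")
if shutil.which("ffmpeg") is None:
    print("ffmpeg not found; audio would be sent as a regular file")
else:
    print("would run:", " ".join(cmd))
```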
|
||||
|
||||
:::tip
|
||||
If you want voice bubbles without installing ffmpeg, switch to the OpenAI or ElevenLabs provider.
|
||||
:::
|
||||
|
||||
## Voice Message Transcription (STT)
|
||||
|
||||
Voice messages sent on Telegram, Discord, WhatsApp, Slack, or Signal are automatically transcribed and injected as text into the conversation. The agent sees the transcript as normal text.
|
||||
|
||||
| Provider | Quality | Cost | API Key |
|
||||
|----------|---------|------|---------|
|
||||
| **Local Whisper** (default) | Good | Free | None needed |
|
||||
| **Groq Whisper API** | Good–Best | Free tier | `GROQ_API_KEY` |
|
||||
| **OpenAI Whisper API** | Good–Best | Paid | `VOICE_TOOLS_OPENAI_KEY` or `OPENAI_API_KEY` |
|
||||
|
||||
:::info Zero Config
|
||||
Local transcription works out of the box when `faster-whisper` is installed. If that's unavailable, Hermes can also use a local `whisper` CLI from common install locations (like `/opt/homebrew/bin`) or a custom command via `HERMES_LOCAL_STT_COMMAND`.
|
||||
:::
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
stt:
|
||||
provider: "local" # "local" | "groq" | "openai"
|
||||
local:
|
||||
model: "base" # tiny, base, small, medium, large-v3
|
||||
openai:
|
||||
model: "whisper-1" # whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe
|
||||
```
|
||||
|
||||
### Provider Details
|
||||
|
||||
**Local (faster-whisper)** — Runs Whisper locally via [faster-whisper](https://github.com/SYSTRAN/faster-whisper). Uses CPU by default, GPU if available. Model sizes:
|
||||
|
||||
| Model | Size | Speed | Quality |
|
||||
|-------|------|-------|---------|
|
||||
| `tiny` | ~75 MB | Fastest | Basic |
|
||||
| `base` | ~150 MB | Fast | Good (default) |
|
||||
| `small` | ~500 MB | Medium | Better |
|
||||
| `medium` | ~1.5 GB | Slower | Great |
|
||||
| `large-v3` | ~3 GB | Slowest | Best |
|
||||
|
||||
**Groq API** — Requires `GROQ_API_KEY`. Good cloud fallback when you want a free hosted STT option.
|
||||
|
||||
**OpenAI API** — Accepts `VOICE_TOOLS_OPENAI_KEY` first and falls back to `OPENAI_API_KEY`. Supports `whisper-1`, `gpt-4o-mini-transcribe`, and `gpt-4o-transcribe`.
|
||||
|
||||
**Custom local CLI fallback** — Set `HERMES_LOCAL_STT_COMMAND` if you want Hermes to call a local transcription command directly. The command template supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders.
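
A sketch of how such a template might expand, assuming plain `{name}` substitution with shell quoting. The template string, the paths, and `render_stt_command` are illustrative assumptions; how Hermes actually substitutes and executes the command may differ.

```python
import shlex


def render_stt_command(template: str, **values: str) -> list[str]:
    """Fill placeholders, shell-quoting each value, then split into argv."""
    quoted = {key: shlex.quote(value) for key, value in values.items()}
    return shlex.split(template.format(**quoted))


# Example template for a local whisper CLI; adjust to the tool you use.
template = "whisper {input_path} --model {model} --language {language} --output_dir {output_dir}"
argv = render_stt_command(
    template,
    input_path="/tmp/voice note.ogg",  # quoting keeps the space inside one argument
    output_dir="/tmp/stt",
    language="en",
    model="base",
)
print(argv)
```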
|
||||
|
||||
### Fallback Behavior
|
||||
|
||||
If your configured provider isn't available, Hermes automatically falls back:
|
||||
- **Local faster-whisper unavailable** → Tries a local `whisper` CLI or `HERMES_LOCAL_STT_COMMAND` before cloud providers
|
||||
- **Groq key not set** → Falls back to local transcription, then OpenAI
|
||||
- **OpenAI key not set** → Falls back to local transcription, then Groq
|
||||
- **Nothing available** → Voice messages pass through with an accurate note to the user
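
The fallback rules above can be read as an ordered list of candidates per configured provider. The sketch below encodes that reading. The provider labels and both helpers are hypothetical, and the relative order of Groq and OpenAI after a failed local attempt is an assumption, since the list does not specify it.

```python
from typing import Optional


def stt_fallback_order(configured: str) -> list[str]:
    """Order in which providers are tried for a given configured provider."""
    order = {
        # local_cli stands for a `whisper` CLI or HERMES_LOCAL_STT_COMMAND
        "local": ["local", "local_cli", "groq", "openai"],
        "groq": ["groq", "local", "openai"],
        "openai": ["openai", "local", "groq"],
    }
    return order[configured]


def pick_provider(configured: str, available: set[str]) -> Optional[str]:
    """Return the first available provider, or None if nothing can transcribe."""
    for name in stt_fallback_order(configured):
        if name in available:
            return name
    return None  # the voice message passes through with a note to the user


print(pick_provider("groq", {"local", "openai"}))  # local
print(pick_provider("local", set()))               # None
```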
|
||||
187
hermes_code/website/docs/user-guide/features/vision.md
Normal file
|
|
@ -0,0 +1,187 @@
|
|||
---
|
||||
title: Vision & Image Paste
|
||||
description: Paste images from your clipboard into the Hermes CLI for multimodal vision analysis.
|
||||
sidebar_label: Vision & Image Paste
|
||||
sidebar_position: 7
|
||||
---
|
||||
|
||||
# Vision & Image Paste
|
||||
|
||||
Hermes Agent supports **multimodal vision** — you can paste images from your clipboard directly into the CLI and ask the agent to analyze, describe, or work with them. Images are sent to the model as base64-encoded content blocks, so any vision-capable model can process them.
|
||||
|
||||
## How It Works
|
||||
|
||||
1. Copy an image to your clipboard (screenshot, browser image, etc.)
|
||||
2. Attach it using one of the methods below
|
||||
3. The image appears as a `[📎 Image #1]` badge above the input
4. Type your question and press Enter
|
||||
5. On submit, the image is sent to the model as a vision content block
|
||||
|
||||
You can attach multiple images before sending — each gets its own badge. Press `Ctrl+C` to clear all attached images.
|
||||
|
||||
Images are saved to `~/.hermes/images/` as PNG files with timestamped filenames.
|
||||
|
||||
## Paste Methods
|
||||
|
||||
How you attach an image depends on your terminal environment. Not all methods work everywhere — here's the full breakdown:
|
||||
|
||||
### `/paste` Command
|
||||
|
||||
**The most reliable method. Works in every local terminal.**
|
||||
|
||||
```
|
||||
/paste
|
||||
```
|
||||
|
||||
Type `/paste` and press Enter. Hermes checks your clipboard for an image and attaches it. This works in every local terminal because it calls the clipboard backend explicitly, so terminal keybinding interception is not a concern. Over SSH, no clipboard method works; see [SSH & Remote Sessions](#ssh--remote-sessions).
|
||||
|
||||
### Ctrl+V / Cmd+V (Bracketed Paste)
|
||||
|
||||
When you paste text that's on the clipboard alongside an image, Hermes automatically checks for an image too. This works when:
|
||||
- Your clipboard contains **both text and an image** (some apps put both on the clipboard when you copy)
|
||||
- Your terminal supports bracketed paste (most modern terminals do)
|
||||
|
||||
:::warning
|
||||
If your clipboard has **only an image** (no text), Ctrl+V does nothing in most terminals. Terminals can only paste text — there's no standard mechanism to paste binary image data. Use `/paste` or Alt+V instead.
|
||||
:::
|
||||
|
||||
### Alt+V
|
||||
|
||||
Alt key combinations pass through most terminal emulators (they're sent as ESC + key rather than being intercepted). Press `Alt+V` to check the clipboard for an image.
|
||||
|
||||
:::caution
|
||||
**Does not work in VSCode's integrated terminal.** VSCode intercepts many Alt+key combos for its own UI. Use `/paste` instead.
|
||||
:::
|
||||
|
||||
### Ctrl+V (Raw — Linux Only)
|
||||
|
||||
On Linux desktop terminals (GNOME Terminal, Konsole, Alacritty, etc.), `Ctrl+V` is **not** the paste shortcut — `Ctrl+Shift+V` is. So `Ctrl+V` sends a raw byte to the application, and Hermes catches it to check the clipboard. This only works on Linux desktop terminals with X11 or Wayland clipboard access.
|
||||
|
||||
## Platform Compatibility
|
||||
|
||||
| Environment | `/paste` | Ctrl+V text+image | Alt+V | Notes |
|
||||
|---|:---:|:---:|:---:|---|
|
||||
| **macOS Terminal / iTerm2** | ✅ | ✅ | ✅ | Best experience — `osascript` always available |
|
||||
| **Linux X11 desktop** | ✅ | ✅ | ✅ | Requires `xclip` (`apt install xclip`) |
|
||||
| **Linux Wayland desktop** | ✅ | ✅ | ✅ | Requires `wl-paste` (`apt install wl-clipboard`) |
|
||||
| **WSL2 (Windows Terminal)** | ✅ | ✅¹ | ✅ | Uses `powershell.exe` — no extra install needed |
|
||||
| **VSCode Terminal (local)** | ✅ | ✅¹ | ❌ | VSCode intercepts Alt+key |
|
||||
| **VSCode Terminal (SSH)** | ❌² | ❌² | ❌ | Remote clipboard not accessible |
|
||||
| **SSH terminal (any)** | ❌² | ❌² | ❌² | Remote clipboard not accessible |
|
||||
|
||||
¹ Only when clipboard has both text and an image (image-only clipboard = nothing happens)
|
||||
² See [SSH & Remote Sessions](#ssh--remote-sessions) below
|
||||
|
||||
## Platform-Specific Setup
|
||||
|
||||
### macOS
|
||||
|
||||
**No setup required.** Hermes uses `osascript` (built into macOS) to read the clipboard. For faster performance, optionally install `pngpaste`:
|
||||
|
||||
```bash
|
||||
brew install pngpaste
|
||||
```
|
||||
|
||||
### Linux (X11)
|
||||
|
||||
Install `xclip`:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install xclip
|
||||
|
||||
# Fedora
|
||||
sudo dnf install xclip
|
||||
|
||||
# Arch
|
||||
sudo pacman -S xclip
|
||||
```
|
||||
|
||||
### Linux (Wayland)
|
||||
|
||||
Modern Linux desktops (Ubuntu 22.04+, Fedora 34+) often use Wayland by default. Install `wl-clipboard`:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install wl-clipboard
|
||||
|
||||
# Fedora
|
||||
sudo dnf install wl-clipboard
|
||||
|
||||
# Arch
|
||||
sudo pacman -S wl-clipboard
|
||||
```
|
||||
|
||||
:::tip How to check if you're on Wayland
|
||||
```bash
|
||||
echo $XDG_SESSION_TYPE
|
||||
# "wayland" = Wayland, "x11" = X11, "tty" = no display server
|
||||
```
|
||||
:::
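
The same check can be made programmatically. The sketch below maps the platform and `XDG_SESSION_TYPE` to the clipboard tool named in the compatibility table. `clipboard_tool` is a hypothetical helper that ignores WSL2 and other special cases; it is not Hermes's actual detection logic.

```python
import os
import sys
from typing import Optional


def clipboard_tool(platform: str, session_type: str) -> Optional[str]:
    """Pick the clipboard tool for a platform and display-server session type."""
    if platform == "darwin":
        return "osascript"
    if session_type == "wayland":
        return "wl-paste"
    if session_type == "x11":
        return "xclip"
    return None  # e.g. "tty": no display server, so no clipboard to read


print(clipboard_tool(sys.platform, os.environ.get("XDG_SESSION_TYPE", "")))
```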
|
||||
|
||||
### WSL2
|
||||
|
||||
**No extra setup required.** Hermes detects WSL2 automatically (via `/proc/version`) and uses `powershell.exe` to access the Windows clipboard through .NET's `System.Windows.Forms.Clipboard`. This is built into WSL2's Windows interop — `powershell.exe` is available by default.
|
||||
|
||||
The clipboard data is transferred as base64-encoded PNG over stdout, so no file path conversion or temp files are needed.
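
Detection via `/proc/version` can be as simple as a case-insensitive search for "microsoft" in the kernel version string. The sketch below shows that check; the function names are hypothetical and Hermes's own detection may be stricter.

```python
from pathlib import Path


def looks_like_wsl(proc_version: str) -> bool:
    """True when the kernel version string mentions Microsoft, as WSL kernels do."""
    return "microsoft" in proc_version.lower()


def running_under_wsl() -> bool:
    """Read /proc/version if it exists and apply the check above."""
    try:
        return looks_like_wsl(Path("/proc/version").read_text())
    except OSError:
        return False  # no readable /proc/version, e.g. on macOS


print(running_under_wsl())
```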
|
||||
|
||||
:::info WSLg Note
|
||||
If you're running WSLg (WSL2 with GUI support), Hermes tries the PowerShell path first, then falls back to `wl-paste`. WSLg's clipboard bridge only supports BMP format for images — Hermes auto-converts BMP to PNG using Pillow (if installed) or ImageMagick's `convert` command.
|
||||
:::
|
||||
|
||||
#### Verify WSL2 clipboard access
|
||||
|
||||
```bash
|
||||
# 1. Check WSL detection
|
||||
grep -i microsoft /proc/version
|
||||
|
||||
# 2. Check PowerShell is accessible
|
||||
which powershell.exe
|
||||
|
||||
# 3. Copy an image, then check
|
||||
powershell.exe -NoProfile -Command "Add-Type -AssemblyName System.Windows.Forms; [System.Windows.Forms.Clipboard]::ContainsImage()"
|
||||
# Should print "True"
|
||||
```
|
||||
|
||||
## SSH & Remote Sessions
|
||||
|
||||
**Clipboard paste does not work over SSH.** When you SSH into a remote machine, the Hermes CLI runs on the remote host. All clipboard tools (`xclip`, `wl-paste`, `powershell.exe`, `osascript`) read the clipboard of the machine they run on — which is the remote server, not your local machine. Your local clipboard is inaccessible from the remote side.
|
||||
|
||||
### Workarounds for SSH
|
||||
|
||||
1. **Upload the image file** — Save the image locally, upload it to the remote server via `scp`, VSCode's file explorer (drag-and-drop), or any file transfer method. Then reference it by path. *(A `/attach <filepath>` command is planned for a future release.)*
|
||||
|
||||
2. **Use a URL** — If the image is accessible online, just paste the URL in your message. The agent can use `vision_analyze` to look at any image URL directly.
|
||||
|
||||
3. **X11 forwarding** — Connect with `ssh -X` to forward X11. This lets `xclip` on the remote machine access your local X11 clipboard. Requires an X server running locally (XQuartz on macOS, built-in on Linux X11 desktops). Slow for large images.
|
||||
|
||||
4. **Use a messaging platform** — Send images to Hermes via Telegram, Discord, Slack, or WhatsApp. These platforms handle image upload natively and are not affected by clipboard/terminal limitations.
|
||||
|
||||
## Why Terminals Can't Paste Images
|
||||
|
||||
This is a common source of confusion, so here's the technical explanation:
|
||||
|
||||
Terminals are **text-based** interfaces. When you press Ctrl+V (or Cmd+V), the terminal emulator:
|
||||
|
||||
1. Reads the clipboard for **text content**
|
||||
2. Wraps it in [bracketed paste](https://en.wikipedia.org/wiki/Bracketed-paste) escape sequences
|
||||
3. Sends it to the application through the terminal's text stream
|
||||
|
||||
If the clipboard contains only an image (no text), the terminal has nothing to send. There is no standard terminal escape sequence for binary image data. The terminal simply does nothing.
|
||||
|
||||
This is why Hermes uses a separate clipboard check — instead of receiving image data through the terminal paste event, it calls OS-level tools (`osascript`, `powershell.exe`, `xclip`, `wl-paste`) directly via subprocess to read the clipboard independently.
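
To see why an image-only clipboard produces nothing, it helps to look at what bracketed paste actually delivers: text wrapped between the `ESC [ 200 ~` and `ESC [ 201 ~` markers. The sketch below is illustrative and its helper names are hypothetical; it only demonstrates the framing, not Hermes's input handling.

```python
from typing import Optional

PASTE_START = "\x1b[200~"  # sent by the terminal before pasted text
PASTE_END = "\x1b[201~"    # sent by the terminal after pasted text


def wrap_bracketed_paste(text: str) -> str:
    """What a terminal emits when text is pasted with bracketed paste enabled."""
    return f"{PASTE_START}{text}{PASTE_END}"


def extract_pasted_text(stream: str) -> Optional[str]:
    """Return the text between the paste markers, or None if no paste occurred."""
    start = stream.find(PASTE_START)
    end = stream.find(PASTE_END, start + len(PASTE_START))
    if start == -1 or end == -1:
        return None
    return stream[start + len(PASTE_START):end]


print(extract_pasted_text(wrap_bracketed_paste("hello")))  # hello
print(extract_pasted_text(""))  # None: an image-only clipboard sends nothing
```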
|
||||
|
||||
## Supported Models
|
||||
|
||||
Image paste works with any vision-capable model. The image is sent as a base64-encoded data URL in the OpenAI vision content format:
|
||||
|
||||
```json
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/png;base64,..."
  }
}
```
|
||||
|
||||
Most modern models support this format, including GPT-4 Vision, Claude (with vision), Gemini, and open-source multimodal models served through OpenRouter.
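
A minimal sketch of how such a content part could be built from raw image bytes. The helper names `image_content_part` and `user_message` are made up for illustration, and the surrounding `role`/`content` message layout follows the common OpenAI-style chat format rather than anything confirmed in the Hermes source.

```python
import base64


def image_content_part(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Wrap raw image bytes as a base64 data URL in an image_url content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


def user_message(text: str, image_bytes: bytes) -> dict:
    """Combine a text part and an image part into one user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            image_content_part(image_bytes),
        ],
    }
```

Base64 inflates the payload by roughly a third, so very large screenshots may be worth downscaling before they are encoded.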
|
||||
508
hermes_code/website/docs/user-guide/features/voice-mode.md
Normal file
|
|
@ -0,0 +1,508 @@
|
|||
---
|
||||
sidebar_position: 10
|
||||
title: "Voice Mode"
|
||||
description: "Real-time voice conversations with Hermes Agent — CLI, Telegram, Discord (DMs, text channels, and voice channels)"
|
||||
---
|
||||
|
||||
# Voice Mode
|
||||
|
||||
Hermes Agent supports full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
|
||||
|
||||
If you want a practical setup walkthrough with recommended configurations and real usage patterns, see [Use Voice Mode with Hermes](/docs/guides/use-voice-mode-with-hermes).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before using voice features, make sure you have:
|
||||
|
||||
1. **Hermes Agent installed** — `pip install hermes-agent` (see [Installation](/docs/getting-started/installation))
|
||||
2. **An LLM provider configured** — run `hermes model` or set your preferred provider credentials in `~/.hermes/.env`
|
||||
3. **A working base setup** — run `hermes` to verify the agent responds to text before enabling voice
|
||||
|
||||
:::tip
|
||||
The `~/.hermes/` directory and default `config.yaml` are created automatically the first time you run `hermes`. You only need to create `~/.hermes/.env` manually for API keys.
|
||||
:::
|
||||
|
||||
## Overview
|
||||
|
||||
| Feature | Platform | Description |
|
||||
|---------|----------|-------------|
|
||||
| **Interactive Voice** | CLI | Press Ctrl+B to record, agent auto-detects silence and responds |
|
||||
| **Auto Voice Reply** | Telegram, Discord | Agent sends spoken audio alongside text responses |
|
||||
| **Voice Channel** | Discord | Bot joins VC, listens to users speaking, speaks replies back |
|
||||
|
||||
## Requirements
|
||||
|
||||
### Python Packages
|
||||
|
||||
```bash
|
||||
# CLI voice mode (microphone + audio playback)
|
||||
pip install "hermes-agent[voice]"
|
||||
|
||||
# Discord + Telegram messaging (includes discord.py[voice] for VC support)
|
||||
pip install "hermes-agent[messaging]"
|
||||
|
||||
# Premium TTS (ElevenLabs)
|
||||
pip install "hermes-agent[tts-premium]"
|
||||
|
||||
# Local TTS (NeuTTS, optional)
|
||||
python -m pip install -U neutts[all]
|
||||
|
||||
# Everything at once
|
||||
pip install "hermes-agent[all]"
|
||||
```
|
||||
|
||||
| Extra | Packages | Required For |
|
||||
|-------|----------|-------------|
|
||||
| `voice` | `sounddevice`, `numpy` | CLI voice mode |
|
||||
| `messaging` | `discord.py[voice]`, `python-telegram-bot`, `aiohttp` | Discord & Telegram bots |
|
||||
| `tts-premium` | `elevenlabs` | ElevenLabs TTS provider |
|
||||
|
||||
Optional local TTS provider: install `neutts` separately with `python -m pip install -U neutts[all]`. On first use it downloads the model automatically.
|
||||
|
||||
:::info
|
||||
`discord.py[voice]` installs **PyNaCl** (for voice encryption) and **opus bindings** automatically. This is required for Discord voice channel support.
|
||||
:::
|
||||
|
||||
### System Dependencies
|
||||
|
||||
```bash
|
||||
# macOS
|
||||
brew install portaudio ffmpeg opus
|
||||
brew install espeak-ng # for NeuTTS
|
||||
|
||||
# Ubuntu/Debian
|
||||
sudo apt install portaudio19-dev ffmpeg libopus0
|
||||
sudo apt install espeak-ng # for NeuTTS
|
||||
```
|
||||
|
||||
| Dependency | Purpose | Required For |
|
||||
|-----------|---------|-------------|
|
||||
| **PortAudio** | Microphone input and audio playback | CLI voice mode |
|
||||
| **ffmpeg** | Audio format conversion (MP3 → Opus, PCM → WAV) | All platforms |
|
||||
| **Opus** | Discord voice codec | Discord voice channels |
|
||||
| **espeak-ng** | Phonemizer backend | Local NeuTTS provider |
|
||||
|
||||
### API Keys
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Speech-to-Text — local provider needs NO key at all
|
||||
# pip install faster-whisper # Free, runs locally, recommended
|
||||
GROQ_API_KEY=your-key # Groq Whisper — fast, free tier (cloud)
|
||||
VOICE_TOOLS_OPENAI_KEY=your-key # OpenAI Whisper — paid (cloud)
|
||||
|
||||
# Text-to-Speech (optional — Edge TTS and NeuTTS work without any key)
|
||||
ELEVENLABS_API_KEY=your-key # ElevenLabs (premium quality)
|
||||
# VOICE_TOOLS_OPENAI_KEY above also enables OpenAI TTS
|
||||
```
|
||||
|
||||
:::tip
|
||||
If `faster-whisper` is installed, voice mode works with **zero API keys** for STT. The model (~150 MB for `base`) downloads automatically on first use.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## CLI Voice Mode
|
||||
|
||||
### Quick Start
|
||||
|
||||
Start the CLI and enable voice mode:
|
||||
|
||||
```bash
|
||||
hermes # Start the interactive CLI
|
||||
```
|
||||
|
||||
Then use these commands inside the CLI:
|
||||
|
||||
```
|
||||
/voice Toggle voice mode on/off
|
||||
/voice on Enable voice mode
|
||||
/voice off Disable voice mode
|
||||
/voice tts Toggle TTS output
|
||||
/voice status Show current state
|
||||
```
|
||||
|
||||
### How It Works
|
||||
|
||||
1. Start the CLI with `hermes` and enable voice mode with `/voice on`
|
||||
2. **Press Ctrl+B** — a beep plays (880Hz), recording starts
|
||||
3. **Speak** — a live audio level bar shows your input: `● [▁▂▃▅▇▇▅▂] ❯`
|
||||
4. **Stop speaking** — after 3 seconds of silence, recording auto-stops
|
||||
5. **Two beeps** play (660Hz) confirming the recording ended
|
||||
6. Audio is transcribed via Whisper and sent to the agent
|
||||
7. If TTS is enabled, the agent's reply is spoken aloud
|
||||
8. Recording **automatically restarts** — speak again without pressing any key
|
||||
|
||||
This loop continues until you press **Ctrl+B** during recording (exits continuous mode) or 3 consecutive recordings detect no speech.
|
||||
|
||||
:::tip
|
||||
The record key is configurable via `voice.record_key` in `~/.hermes/config.yaml` (default: `ctrl+b`).
|
||||
:::
|
||||
|
||||
### Silence Detection
|
||||
|
||||
A two-stage algorithm detects when you've finished speaking:
|
||||
|
||||
1. **Speech confirmation** — waits for audio above the RMS threshold (200) for at least 0.3s, tolerating brief dips between syllables
|
||||
2. **End detection** — once speech is confirmed, triggers after 3.0 seconds of continuous silence
|
||||
|
||||
If no speech is detected at all for 15 seconds, recording stops automatically.
|
||||
|
||||
Both `silence_threshold` and `silence_duration` are configurable in `config.yaml`.
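
The two stages can be sketched as a small state machine fed one RMS value per audio frame. This is an illustration under stated assumptions, not the Hermes implementation: the class name, the `dip_tolerance` value, and the choice to reset the speech counter after a longer dip are all hypothetical. Only the thresholds (200, 0.3 s, 3.0 s, 15 s) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class SilenceDetector:
    threshold: float = 200.0         # RMS below this counts as silence
    min_speech: float = 0.3          # seconds of speech needed to confirm
    silence_duration: float = 3.0    # seconds of silence that end a recording
    no_speech_timeout: float = 15.0  # give up if speech is never confirmed
    dip_tolerance: float = 0.2       # assumption: shorter dips do not reset stage 1

    speech_time: float = 0.0
    quiet_time: float = 0.0
    elapsed: float = 0.0
    confirmed: bool = False

    def feed(self, rms: float, frame_seconds: float) -> bool:
        """Process one frame; return True when recording should stop."""
        self.elapsed += frame_seconds
        if rms >= self.threshold:
            self.quiet_time = 0.0
            self.speech_time += frame_seconds
            if self.speech_time >= self.min_speech:
                self.confirmed = True  # stage 1 passed
        else:
            self.quiet_time += frame_seconds
            if not self.confirmed and self.quiet_time > self.dip_tolerance:
                self.speech_time = 0.0  # the dip was too long to be a syllable gap
        if self.confirmed:
            # Stage 2: stop after enough continuous silence.
            return self.quiet_time >= self.silence_duration
        return self.elapsed >= self.no_speech_timeout
```

Requiring confirmed speech before the end-of-speech timer applies keeps a short burst of noise from ending a recording that has not really started.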
|
||||
|
||||
### Streaming TTS
|
||||
|
||||
When TTS is enabled, the agent speaks its reply **sentence-by-sentence** as it generates text — you don't wait for the full response:
|
||||
|
||||
1. Buffers text deltas into complete sentences (min 20 chars)
|
||||
2. Strips markdown formatting and `<think>` blocks
|
||||
3. Generates and plays audio per sentence in real-time
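
The buffering steps above can be sketched as follows. This is a simplified illustration with hypothetical names (`SentenceBuffer`, `strip_markdown`). The sentence-splitting rule and the markdown characters it strips are assumptions, and the real pipeline may handle more cases. Only the 20-character minimum and the removal of `<think>` blocks come from the list above.

```python
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
MARKDOWN = re.compile(r"[*`#]+")


def strip_markdown(text: str) -> str:
    """Remove a few common markdown markers (assumed set) and trim whitespace."""
    return MARKDOWN.sub("", text).strip()


class SentenceBuffer:
    def __init__(self, min_chars: int = 20) -> None:
        self.min_chars = min_chars
        self.pending = ""

    def push(self, delta: str) -> list[str]:
        """Add a text delta; return sentences that are ready to be spoken."""
        self.pending = THINK_BLOCK.sub("", self.pending + delta)
        if "<think>" in self.pending:
            return []  # wait for the closing tag before speaking anything
        parts = SENTENCE_END.split(self.pending)
        self.pending = parts.pop()  # trailing fragment, possibly incomplete
        ready, chunk = [], ""
        for part in parts:
            chunk = f"{chunk} {part}".strip()
            if len(chunk) >= self.min_chars:
                ready.append(chunk)
                chunk = ""
        if chunk:  # too short to speak alone: keep it for the next delta
            self.pending = f"{chunk} {self.pending}"
        return [s for s in (strip_markdown(c) for c in ready) if s]

    def flush(self) -> list[str]:
        """Return whatever is left once the response has finished."""
        rest, self.pending = strip_markdown(self.pending), ""
        return [rest] if rest else []
```

Each string returned by `push` would be handed to the TTS provider right away, which is what lets playback start before the full reply exists.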
|
||||
|
||||
### Hallucination Filter
|
||||
|
||||
Whisper sometimes generates phantom text from silence or background noise ("Thank you for watching", "Subscribe", etc.). The agent filters these out using a set of 26 known hallucination phrases across multiple languages, plus a regex pattern that catches repetitive variations.
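
A toy version of such a filter, to show the shape of the check. The phrase set here is a three-entry stand-in for the real list, the repetition pattern is an assumption about what "repetitive variations" means, and the function name is hypothetical.

```python
import re

# Illustrative subset only; the real filter is described as having 26 phrases.
KNOWN_PHRASES = {"thank you for watching", "thanks for watching", "subscribe"}

# Assumed repetition rule: the same word three or more times in a row.
REPEATED_WORD = re.compile(r"(\w+)(?: \1){2,}")


def looks_like_hallucination(transcript: str) -> bool:
    """Return True if the transcript should be dropped instead of sent to the agent."""
    no_punct = re.sub(r"[^\w\s]", "", transcript).lower()
    normalized = " ".join(no_punct.split())
    if not normalized:
        return True  # nothing to act on
    if normalized in KNOWN_PHRASES:
        return True
    return REPEATED_WORD.fullmatch(normalized) is not None
```

Normalizing case and punctuation first lets one entry cover variants such as "Thank you for watching!" and "thank you for watching."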
|
||||
|
||||
---
|
||||
|
||||
## Gateway Voice Reply (Telegram & Discord)
|
||||
|
||||
If you haven't set up your messaging bots yet, see the platform-specific guides:
|
||||
- [Telegram Setup Guide](../messaging/telegram.md)
|
||||
- [Discord Setup Guide](../messaging/discord.md)
|
||||
|
||||
Start the gateway to connect to your messaging platforms:
|
||||
|
||||
```bash
|
||||
hermes gateway # Start the gateway (connects to configured platforms)
|
||||
hermes gateway setup # Interactive setup wizard for first-time configuration
|
||||
```
|
||||
|
||||
### Discord: Channels vs DMs
|
||||
|
||||
The bot supports two interaction modes on Discord:
|
||||
|
||||
| Mode | How to Talk | Mention Required | Setup |
|
||||
|------|------------|-----------------|-------|
|
||||
| **Direct Message (DM)** | Open the bot's profile → "Message" | No | Works immediately |
|
||||
| **Server Channel** | Type in a text channel where the bot is present | Yes (`@botname`) | Bot must be invited to the server |
|
||||
|
||||
**DM (recommended for personal use):** Just open a DM with the bot and type — no @mention needed. Voice replies and all commands work the same as in channels.
|
||||
|
||||
**Server channels:** The bot only responds when you @mention it (e.g. `@hermesbyt4 hello`). Make sure you select the **bot user** from the mention popup, not the role with the same name.
|
||||
|
||||
:::tip
|
||||
To disable the mention requirement in server channels, add to `~/.hermes/.env`:
|
||||
```bash
|
||||
DISCORD_REQUIRE_MENTION=false
|
||||
```
|
||||
Or set specific channels as free-response (no mention needed):
|
||||
```bash
|
||||
DISCORD_FREE_RESPONSE_CHANNELS=123456789,987654321
|
||||
```
|
||||
:::
|
||||
|
||||
### Commands
|
||||
|
||||
These work in both Telegram and Discord (DMs and text channels):
|
||||
|
||||
```
|
||||
/voice Toggle voice mode on/off
|
||||
/voice on Voice replies only when you send a voice message
|
||||
/voice tts Voice replies for ALL messages
|
||||
/voice off Disable voice replies
|
||||
/voice status Show current setting
|
||||
```
|
||||
|
||||
### Modes
|
||||
|
||||
| Mode | Command | Behavior |
|
||||
|------|---------|----------|
|
||||
| `off` | `/voice off` | Text only (default) |
|
||||
| `voice_only` | `/voice on` | Speaks reply only when you send a voice message |
|
||||
| `all` | `/voice tts` | Speaks reply to every message |
|
||||
|
||||
Voice mode setting is persisted across gateway restarts.
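
The table above boils down to one decision per incoming message. A minimal sketch, using the mode names from the table and hypothetical Python names:

```python
# Explicit /voice subcommands and the mode each one selects (from the table above).
COMMAND_TO_MODE = {"off": "off", "on": "voice_only", "tts": "all"}


def should_speak(mode: str, incoming_was_voice: bool) -> bool:
    """Decide whether the reply should be sent as audio."""
    if mode == "all":
        return True
    if mode == "voice_only":
        return incoming_was_voice
    return False  # "off" or any unknown mode: text only
```

For example, with `/voice on` active, a typed message gets a text reply while a voice message gets a spoken one.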
|
||||
|
||||
### Platform Delivery
|
||||
|
||||
| Platform | Format | Notes |
|
||||
|----------|--------|-------|
|
||||
| **Telegram** | Voice bubble (Opus/OGG) | Plays inline in chat. ffmpeg converts MP3 → Opus if needed |
|
||||
| **Discord** | Native voice bubble (Opus/OGG) | Plays inline like a user voice message. Falls back to file attachment if voice bubble API fails |
|
||||
|
||||
---
|
||||
|
||||
## Discord Voice Channels
|
||||
|
||||
The most immersive voice feature: the bot joins a Discord voice channel, listens to users speaking, transcribes their speech, runs it through the agent, and speaks the reply back in the voice channel.
|
||||
|
||||
### Setup
|
||||
|
||||
#### 1. Discord Bot Permissions
|
||||
|
||||
If you already have a Discord bot set up for text (see [Discord Setup Guide](../messaging/discord.md)), you need to add voice permissions.
|
||||
|
||||
Go to the [Discord Developer Portal](https://discord.com/developers/applications) → your application → **Installation** → **Default Install Settings** → **Guild Install**:
|
||||
|
||||
**Add these permissions to the existing text permissions:**
|
||||
|
||||
| Permission | Purpose | Required |
|
||||
|-----------|---------|----------|
|
||||
| **Connect** | Join voice channels | Yes |
|
||||
| **Speak** | Play TTS audio in voice channels | Yes |
|
||||
| **Use Voice Activity** | Detect when users are speaking | Recommended |
|
||||
|
||||
**Updated Permissions Integer:**
|
||||
|
||||
| Level | Integer | What's Included |
|
||||
|-------|---------|----------------|
|
||||
| Text only | `274878286912` | View Channels, Send Messages, Read History, Embeds, Attachments, Threads, Reactions |
|
||||
| Text + Voice | `274881432640` | All above + Connect, Speak |
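
The two integers differ only by the Connect and Speak permission bits, which can be checked with a few lines of arithmetic. The bit positions used here (`1 << 20` for Connect, `1 << 21` for Speak) are the values from Discord's permission flag documentation; the constant names are just for illustration.

```python
CONNECT = 1 << 20  # 1048576
SPEAK = 1 << 21    # 2097152

TEXT_ONLY = 274878286912
TEXT_AND_VOICE = TEXT_ONLY | CONNECT | SPEAK

print(TEXT_AND_VOICE)              # 274881432640
print(TEXT_AND_VOICE - TEXT_ONLY)  # 3145728
```

Use Voice Activity is not part of either integer in the table, so add its bit separately if you want the bot invited with it.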
|
||||
|
||||
**Re-invite the bot** with the updated permissions URL:
|
||||
|
||||
```
|
||||
https://discord.com/oauth2/authorize?client_id=YOUR_APP_ID&scope=bot+applications.commands&permissions=274881432640
|
||||
```
|
||||
|
||||
Replace `YOUR_APP_ID` with your Application ID from the Developer Portal.
|
||||
|
||||
:::warning
|
||||
Re-inviting the bot to a server it's already in will update its permissions without removing it. You won't lose any data or configuration.
|
||||
:::
|
||||
|
||||
#### 2. Privileged Gateway Intents
|
||||
|
||||
In the [Developer Portal](https://discord.com/developers/applications) → your application → **Bot** → **Privileged Gateway Intents**, enable all three:
|
||||
|
||||
| Intent | Purpose |
|
||||
|--------|---------|
|
||||
| **Presence Intent** | Detect user online/offline status |
|
||||
| **Server Members Intent** | Map voice SSRC identifiers to Discord user IDs |
|
||||
| **Message Content Intent** | Read text message content in channels |
|
||||
|
||||
All three are required for full voice channel functionality. **Server Members Intent** is especially critical — without it, the bot cannot identify who is speaking in the voice channel.
|
||||
|
||||
#### 3. Opus Codec
|
||||
|
||||
The Opus codec library must be installed on the machine running the gateway:
|
||||
|
||||
```bash
|
||||
# macOS (Homebrew)
|
||||
brew install opus
|
||||
|
||||
# Ubuntu/Debian
|
||||
sudo apt install libopus0
|
||||
```
|
||||
|
||||
The bot auto-loads the codec from:
|
||||
- **macOS:** `/opt/homebrew/lib/libopus.dylib`
|
||||
- **Linux:** `libopus.so.0`
|
||||
|
||||
#### 4. Environment Variables
|
||||
|
||||
```bash
|
||||
# ~/.hermes/.env
|
||||
|
||||
# Discord bot (already configured for text)
|
||||
DISCORD_BOT_TOKEN=your-bot-token
|
||||
DISCORD_ALLOWED_USERS=your-user-id
|
||||
|
||||
# STT — local provider needs no key (pip install faster-whisper)
|
||||
# GROQ_API_KEY=your-key # Alternative: cloud-based, fast, free tier
|
||||
|
||||
# TTS — optional. Edge TTS and NeuTTS need no key.
|
||||
# ELEVENLABS_API_KEY=your-key # Premium quality
|
||||
# VOICE_TOOLS_OPENAI_KEY=your-key # OpenAI TTS / Whisper
|
||||
```
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway # Start with existing configuration
|
||||
```
|
||||
|
||||
The bot should come online in Discord within a few seconds.
|
||||
|
||||
### Commands
|
||||
|
||||
Use these in the Discord text channel where the bot is present:
|
||||
|
||||
```
|
||||
/voice join Bot joins your current voice channel
|
||||
/voice channel Alias for /voice join
|
||||
/voice leave Bot disconnects from voice channel
|
||||
/voice status Show voice mode and connected channel
|
||||
```
|
||||
|
||||
:::info
|
||||
You must be in a voice channel before running `/voice join`. The bot joins the same VC you're in.
|
||||
:::
|
||||
|
||||
### How It Works
|
||||
|
||||
When the bot joins a voice channel, it:
|
||||
|
||||
1. **Listens** to each user's audio stream independently
|
||||
2. **Detects silence** — 1.5s of silence after at least 0.5s of speech triggers processing
|
||||
3. **Transcribes** the audio via Whisper STT (local, Groq, or OpenAI)
|
||||
4. **Processes** through the full agent pipeline (session, tools, memory)
|
||||
5. **Speaks** the reply back in the voice channel via TTS
|
||||
|
||||
### Text Channel Integration
|
||||
|
||||
When the bot is in a voice channel:
|
||||
|
||||
- Transcripts appear in the text channel: `[Voice] @user: what you said`
|
||||
- Agent responses are sent as text in the channel AND spoken in the VC
|
||||
- The text channel is the one where `/voice join` was issued
|
||||
|
||||
### Echo Prevention
|
||||
|
||||
The bot automatically pauses its audio listener while playing TTS replies, preventing it from hearing and re-processing its own output.
|
||||
|
||||
### Access Control
|
||||
|
||||
Only users listed in `DISCORD_ALLOWED_USERS` can interact via voice. Other users' audio is silently ignored.
|
||||
|
||||
```bash
|
||||
# ~/.hermes/.env
|
||||
DISCORD_ALLOWED_USERS=284102345871466496
|
||||
```
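
Conceptually, the check is a membership test against the configured IDs. The sketch below assumes the variable holds a comma-separated list, as the other allow-list variables in these docs do; the function names are hypothetical.

```python
def parse_allowed_users(raw: str) -> set[str]:
    """Split a comma-separated ID list, ignoring blanks and stray whitespace."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_allowed(user_id: int, raw_allowed: str) -> bool:
    """Return True if the speaking user's ID is on the allow list."""
    return str(user_id) in parse_allowed_users(raw_allowed)
```

Comparing IDs as strings avoids surprises with Discord's 64-bit snowflake IDs when values pass through JSON or environment variables.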
|
||||
|
||||
---
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
### config.yaml
|
||||
|
||||
```yaml
# Voice recording (CLI)
voice:
  record_key: "ctrl+b" # Key to start/stop recording
  max_recording_seconds: 120 # Maximum recording length
  auto_tts: false # Auto-enable TTS when voice mode starts
  silence_threshold: 200 # RMS level (0-32767) below which counts as silence
  silence_duration: 3.0 # Seconds of silence before auto-stop

# Speech-to-Text
stt:
  provider: "local" # "local" (free) | "groq" | "openai"
  local:
    model: "base" # tiny, base, small, medium, large-v3
  # model: "whisper-1" # Legacy: used when provider is not set

# Text-to-Speech
tts:
  provider: "edge" # "edge" (free) | "elevenlabs" | "openai" | "neutts"
  edge:
    voice: "en-US-AriaNeural" # 322 voices, 74 languages
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB" # Adam
    model_id: "eleven_multilingual_v2"
  openai:
    model: "gpt-4o-mini-tts"
    voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
    base_url: "https://api.openai.com/v1" # optional: override for self-hosted or OpenAI-compatible endpoints
  neutts:
    ref_audio: ''
    ref_text: ''
    model: neuphonic/neutts-air-q4-gguf
    device: cpu
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```bash
|
||||
# Speech-to-Text providers (local needs no key)
|
||||
# pip install faster-whisper # Free local STT — no API key needed
|
||||
GROQ_API_KEY=... # Groq Whisper (fast, free tier)
|
||||
VOICE_TOOLS_OPENAI_KEY=... # OpenAI Whisper (paid)
|
||||
|
||||
# STT advanced overrides (optional)
|
||||
STT_GROQ_MODEL=whisper-large-v3-turbo # Override default Groq STT model
|
||||
STT_OPENAI_MODEL=whisper-1 # Override default OpenAI STT model
|
||||
GROQ_BASE_URL=https://api.groq.com/openai/v1 # Custom Groq endpoint
|
||||
STT_OPENAI_BASE_URL=https://api.openai.com/v1 # Custom OpenAI STT endpoint
|
||||
|
||||
# Text-to-Speech providers (Edge TTS and NeuTTS need no key)
|
||||
ELEVENLABS_API_KEY=your-key # ElevenLabs (premium quality)
|
||||
# VOICE_TOOLS_OPENAI_KEY above also enables OpenAI TTS
|
||||
|
||||
# Discord voice channel
|
||||
DISCORD_BOT_TOKEN=...
|
||||
DISCORD_ALLOWED_USERS=...
|
||||
```
|
||||
|
||||
### STT Provider Comparison
|
||||
|
||||
| Provider | Model | Speed | Quality | Cost | API Key |
|
||||
|----------|-------|-------|---------|------|---------|
|
||||
| **Local** | `base` | Fast (depends on CPU/GPU) | Good | Free | No |
|
||||
| **Local** | `small` | Medium | Better | Free | No |
|
||||
| **Local** | `large-v3` | Slow | Best | Free | No |
|
||||
| **Groq** | `whisper-large-v3-turbo` | Very fast (~0.5s) | Good | Free tier | Yes |
|
||||
| **Groq** | `whisper-large-v3` | Fast (~1s) | Better | Free tier | Yes |
|
||||
| **OpenAI** | `whisper-1` | Fast (~1s) | Good | Paid | Yes |
|
||||
| **OpenAI** | `gpt-4o-transcribe` | Medium (~2s) | Best | Paid | Yes |
|
||||
|
||||
Provider priority (automatic fallback): **local** > **groq** > **openai**
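
The fallback order can be pictured as a chain of availability checks. This is a sketch of the stated priority only, with a hypothetical function name. It assumes "local is available" means the `faster_whisper` module can be imported, which may not match how Hermes actually decides.

```python
import importlib.util
import os
from typing import Mapping, Optional


def pick_stt_provider(
    env: Mapping[str, str] = os.environ,
    local_available: Optional[bool] = None,
) -> Optional[str]:
    """Return the first usable provider in the order local > groq > openai."""
    if local_available is None:
        # Assumption: the local provider is usable if faster-whisper is installed.
        local_available = importlib.util.find_spec("faster_whisper") is not None
    if local_available:
        return "local"
    if env.get("GROQ_API_KEY"):
        return "groq"
    if env.get("VOICE_TOOLS_OPENAI_KEY"):
        return "openai"
    return None  # no STT provider configured
```

A `None` result corresponds to the troubleshooting case further down where the bot hears audio but cannot transcribe it.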
|
||||
|
||||
### TTS Provider Comparison
|
||||
|
||||
| Provider | Quality | Cost | Latency | Key Required |
|
||||
|----------|---------|------|---------|-------------|
|
||||
| **Edge TTS** | Good | Free | ~1s | No |
|
||||
| **ElevenLabs** | Excellent | Paid | ~2s | Yes |
|
||||
| **OpenAI TTS** | Good | Paid | ~1.5s | Yes |
|
||||
| **NeuTTS** | Good | Free | Depends on CPU/GPU | No |
|
||||
|
||||
NeuTTS uses the `tts.neutts` config block above.
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "No audio device found" (CLI)
|
||||
|
||||
PortAudio is not installed:
|
||||
|
||||
```bash
|
||||
brew install portaudio # macOS
|
||||
sudo apt install portaudio19-dev # Ubuntu
|
||||
```
|
||||
|
||||
### Bot doesn't respond in Discord server channels
|
||||
|
||||
The bot requires an @mention by default in server channels. Make sure you:
|
||||
|
||||
1. Type `@` and select the **bot user** (with the #discriminator), not the **role** with the same name
|
||||
2. Or use DMs instead — no mention needed
|
||||
3. Or set `DISCORD_REQUIRE_MENTION=false` in `~/.hermes/.env`
|
||||
|
||||
### Bot joins VC but doesn't hear me
|
||||
|
||||
- Check your Discord user ID is in `DISCORD_ALLOWED_USERS`
|
||||
- Make sure you're not muted in Discord
|
||||
- The bot needs a SPEAKING event from Discord before it can map your audio — start speaking within a few seconds of joining
|
||||
|
||||
### Bot hears me but doesn't respond
|
||||
|
||||
- Verify STT is available: install `faster-whisper` (no key needed) or set `GROQ_API_KEY` / `VOICE_TOOLS_OPENAI_KEY`
|
||||
- Check the LLM model is configured and accessible
|
||||
- Review gateway logs: `tail -f ~/.hermes/logs/gateway.log`
|
||||
|
||||
### Bot responds in text but not in voice channel
|
||||
|
||||
- TTS provider may be failing — check API key and quota
|
||||
- Edge TTS (free, no key) is the default fallback
|
||||
- Check logs for TTS errors
|
||||
|
||||
### Whisper returns garbage text
|
||||
|
||||
The hallucination filter catches most cases automatically. If you're still getting phantom transcripts:
|
||||
|
||||
- Use a quieter environment
|
||||
- Adjust `silence_threshold` in config (higher = less sensitive)
|
||||
- Try a different STT model
|
||||
173
hermes_code/website/docs/user-guide/git-worktrees.md
Normal file
|
|
@ -0,0 +1,173 @@
|
|||
---
|
||||
sidebar_position: 9
|
||||
title: "Git Worktrees"
|
||||
description: "Run multiple Hermes agents safely on the same repository using git worktrees and isolated checkouts"
|
||||
---
|
||||
|
||||
# Git Worktrees
|
||||
|
||||
Hermes Agent is often used on large, long‑lived repositories. When you want to:
|
||||
|
||||
- Run **multiple agents in parallel** on the same project, or
|
||||
- Keep experimental refactors isolated from your main branch,
|
||||
|
||||
Git **worktrees** are the safest way to give each agent its own checkout without duplicating the entire repository.
|
||||
|
||||
This page shows how to combine worktrees with Hermes so each session has a clean, isolated working directory.
|
||||
|
||||
## Why Use Worktrees with Hermes?
|
||||
|
||||
Hermes treats the **current working directory** as the project root:
|
||||
|
||||
- CLI: the directory where you run `hermes` or `hermes chat`
|
||||
- Messaging gateways: the directory set by `MESSAGING_CWD`
|
||||
|
||||
If you run multiple agents in the **same checkout**, their changes can interfere with each other:
|
||||
|
||||
- One agent may delete or rewrite files the other is using.
|
||||
- It becomes harder to understand which changes belong to which experiment.
|
||||
|
||||
With worktrees, each agent gets:
|
||||
|
||||
- Its **own branch and working directory**
|
||||
- Its **own Checkpoint Manager history** for `/rollback`
|
||||
|
||||
See also: [Checkpoints and /rollback](./checkpoints-and-rollback.md).
|
||||
|
||||
## Quick Start: Creating a Worktree
|
||||
|
||||
From your main repository (containing `.git/`), create a new worktree for a feature branch:
|
||||
|
||||
```bash
|
||||
# From the main repo root
|
||||
cd /path/to/your/repo
|
||||
|
||||
# Create a new branch and worktree in ../repo-feature
|
||||
git worktree add -b feature/hermes-experiment ../repo-feature
|
||||
```
|
||||
|
||||
This creates:
|
||||
|
||||
- A new directory: `../repo-feature`
|
||||
- A new branch: `feature/hermes-experiment` checked out in that directory
|
||||
|
||||
Now you can `cd` into the new worktree and run Hermes there:
|
||||
|
||||
```bash
|
||||
cd ../repo-feature
|
||||
|
||||
# Start Hermes in the worktree
|
||||
hermes
|
||||
```
|
||||
|
||||
Hermes will:
|
||||
|
||||
- See `../repo-feature` as the project root.
|
||||
- Use that directory for context files, code edits, and tools.
|
||||
- Use a **separate checkpoint history** for `/rollback` scoped to this worktree.
|
||||
|
||||
## Running Multiple Agents in Parallel
|
||||
|
||||
You can create multiple worktrees, each with its own branch:
|
||||
|
||||
```bash
|
||||
cd /path/to/your/repo
|
||||
|
||||
git worktree add -b feature/hermes-a ../repo-experiment-a
|
||||
git worktree add -b feature/hermes-b ../repo-experiment-b
|
||||
```
|
||||
|
||||
In separate terminals:
|
||||
|
||||
```bash
|
||||
# Terminal 1
|
||||
cd ../repo-experiment-a
|
||||
hermes
|
||||
|
||||
# Terminal 2
|
||||
cd ../repo-experiment-b
|
||||
hermes
|
||||
```
|
||||
|
||||
Each Hermes process:
|
||||
|
||||
- Works on its own branch (`feature/hermes-a` vs `feature/hermes-b`).
|
||||
- Writes checkpoints under a different shadow repo hash (derived from the worktree path).
|
||||
- Can use `/rollback` independently without affecting the other.
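
To see why two worktrees never share checkpoint history, it helps to picture the identifier as a hash of the absolute worktree path. The sketch below is purely illustrative: the hash function (SHA-256), the 16-character truncation, and the function name are assumptions, not the scheme Hermes actually uses.

```python
import hashlib
from pathlib import Path


def shadow_repo_id(worktree: str) -> str:
    """Derive a stable identifier from a worktree's absolute path (assumed scheme)."""
    resolved = str(Path(worktree).resolve())
    return hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:16]
```

Because the identifier depends only on the path, moving or renaming a worktree directory would start a fresh checkpoint history under any scheme of this kind.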
|
||||
|
||||
This is especially useful when:
|
||||
|
||||
- Running batch refactors.
|
||||
- Trying different approaches to the same task.
|
||||
- Pairing CLI + gateway sessions against the same upstream repo.
|
||||
|
||||
## Cleaning Up Worktrees Safely
|
||||
|
||||
When you are done with an experiment:
|
||||
|
||||
1. Decide whether to keep or discard the work.
2. If you want to keep it:
   - Merge the branch into your main branch as usual.
3. Remove the worktree:
|
||||
|
||||
```bash
|
||||
cd /path/to/your/repo
|
||||
|
||||
# Remove the worktree directory and its reference
|
||||
git worktree remove ../repo-feature
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
- `git worktree remove` refuses to remove a worktree with uncommitted changes unless you pass `--force`.
|
||||
- Removing a worktree does **not** automatically delete the branch; you can delete or keep the branch using normal `git branch` commands.
|
||||
- Hermes checkpoint data under `~/.hermes/checkpoints/` is not automatically pruned when you remove a worktree, but it is usually very small.
|
||||
|
||||
## Best Practices
|
||||
|
||||
- **One worktree per Hermes experiment**
  - Create a dedicated branch/worktree for each substantial change.
  - This keeps diffs focused and PRs small and reviewable.
- **Name branches after the experiment**
  - e.g. `feature/hermes-checkpoints-docs`, `feature/hermes-refactor-tests`.
- **Commit frequently**
  - Use git commits for high‑level milestones.
  - Use [checkpoints and /rollback](./checkpoints-and-rollback.md) as a safety net for tool‑driven edits in between.
- **Avoid running Hermes from the main repo root when using worktrees**
  - Prefer the worktree directories instead, so each agent has a clear scope.
|
||||
|
||||
## Using `hermes -w` (Automatic Worktree Mode)
|
||||
|
||||
Hermes has a built‑in `-w` flag that **automatically creates a disposable git worktree** with its own branch. You don't need to set up worktrees manually — just `cd` into your repo and run:
|
||||
|
||||
```bash
|
||||
cd /path/to/your/repo
|
||||
hermes -w
|
||||
```
|
||||
|
||||
Hermes will:
|
||||
|
||||
- Create a temporary worktree under `.worktrees/` inside your repo.
|
||||
- Check out an isolated branch (e.g. `hermes/hermes-<hash>`).
|
||||
- Run the full CLI session inside that worktree.
|
||||
|
||||
This is the easiest way to get worktree isolation. You can also combine it with a single query:
|
||||
|
||||
```bash
|
||||
hermes -w -q "Fix issue #123"
|
||||
```
|
||||
|
||||
For parallel agents, open multiple terminals and run `hermes -w` in each — every invocation gets its own worktree and branch automatically.
|
||||
|
||||
## Putting It All Together
|
||||
|
||||
- Use **git worktrees** to give each Hermes session its own clean checkout.
|
||||
- Use **branches** to capture the high‑level history of your experiments.
|
||||
- Use **checkpoints + `/rollback`** to recover from mistakes inside each worktree.
|
||||
|
||||
This combination gives you:
|
||||
|
||||
- Strong guarantees that different agents and experiments do not step on each other.
|
||||
- Fast iteration cycles with easy recovery from bad edits.
|
||||
- Clean, reviewable pull requests.
|
||||
|
||||
|
|
@ -0,0 +1,8 @@
|
|||
{
|
||||
"label": "Messaging Gateway",
|
||||
"position": 3,
|
||||
"link": {
|
||||
"type": "doc",
|
||||
"id": "user-guide/messaging/index"
|
||||
}
|
||||
}
|
||||
192
hermes_code/website/docs/user-guide/messaging/dingtalk.md
Normal file
|
|
@ -0,0 +1,192 @@
|
|||
---
|
||||
sidebar_position: 10
|
||||
title: "DingTalk"
|
||||
description: "Set up Hermes Agent as a DingTalk chatbot"
|
||||
---
|
||||
|
||||
# DingTalk Setup
|
||||
|
||||
Hermes Agent integrates with DingTalk (钉钉) as a chatbot, letting you chat with your AI assistant through direct messages or group chats. The bot connects via DingTalk's Stream Mode — a long-lived WebSocket connection that requires no public URL or webhook server — and replies using markdown-formatted messages through DingTalk's session webhook API.
|
||||
|
||||
Before setup, here's the part most people want to know: how Hermes behaves once it's in your DingTalk workspace.
|
||||
|
||||
## How Hermes Behaves
|
||||
|
||||
| Context | Behavior |
|
||||
|---------|----------|
|
||||
| **DMs (1:1 chat)** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
|
||||
| **Group chats** | Hermes responds when you `@mention` it. Without a mention, Hermes ignores the message. |
|
||||
| **Shared groups with multiple users** | By default, Hermes isolates session history per user inside the group. Two people talking in the same group do not share one transcript unless you explicitly disable that. |
|
||||
|
||||
### Session Model in DingTalk
|
||||
|
||||
By default:
|
||||
|
||||
- each DM gets its own session
|
||||
- each user in a shared group chat gets their own session inside that group
|
||||
|
||||
This is controlled by `config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
Set it to `false` only if you explicitly want one shared conversation for the entire group:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: false
|
||||
```
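
One way to picture the effect of this setting is as the key under which conversation history is stored. The key format and function name below are hypothetical; only the three behaviors (per-DM, per-user-in-group, shared group) come from the description above.

```python
from typing import Optional


def session_key(
    user_id: str,
    group_id: Optional[str],
    group_sessions_per_user: bool = True,
) -> str:
    """Return the history bucket a message belongs to (illustrative format)."""
    if group_id is None:
        return f"dm:{user_id}"  # every DM has its own session
    if group_sessions_per_user:
        return f"group:{group_id}:user:{user_id}"  # isolated per participant
    return f"group:{group_id}"  # one shared transcript for the whole group
```

With the default setting, two people in the same group map to different keys, so neither sees the other's context in the agent's replies.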
|
||||
|
||||
This guide walks you through the full setup process — from creating your DingTalk bot to sending your first message.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Install the required Python packages:
|
||||
|
||||
```bash
|
||||
pip install dingtalk-stream httpx
|
||||
```
|
||||
|
||||
- `dingtalk-stream` — DingTalk's official SDK for Stream Mode (WebSocket-based real-time messaging)
|
||||
- `httpx` — async HTTP client used for sending replies via session webhooks
|
||||
|
||||
## Step 1: Create a DingTalk App
|
||||
|
||||
1. Go to the [DingTalk Developer Console](https://open-dev.dingtalk.com/).
|
||||
2. Log in with your DingTalk admin account.
|
||||
3. Click **Application Development** → **Custom Apps** → **Create App via H5 Micro-App** (or **Robot** depending on your console version).
|
||||
4. Fill in:
   - **App Name**: e.g., `Hermes Agent`
   - **Description**: optional
|
||||
5. After creating, navigate to **Credentials & Basic Info** to find your **Client ID** (AppKey) and **Client Secret** (AppSecret). Copy both.
|
||||
|
||||
:::warning[Credentials shown only once]
|
||||
The Client Secret is only displayed once when you create the app. If you lose it, you'll need to regenerate it. Never share these credentials publicly or commit them to Git.
|
||||
:::
|
||||
|
||||
## Step 2: Enable the Robot Capability
|
||||
|
||||
1. In your app's settings page, go to **Add Capability** → **Robot**.
|
||||
2. Enable the robot capability.
|
||||
3. Under **Message Reception Mode**, select **Stream Mode** (recommended — no public URL needed).
|
||||
|
||||
:::tip
|
||||
Stream Mode is the recommended setup. It uses a long-lived WebSocket connection initiated from your machine, so you don't need a public IP, domain name, or webhook endpoint. This works behind NAT, firewalls, and on local machines.
|
||||
:::
|
||||
|
||||
## Step 3: Find Your DingTalk User ID
|
||||
|
||||
Hermes Agent uses your DingTalk User ID to control who can interact with the bot. DingTalk User IDs are alphanumeric strings set by your organization's admin.
|
||||
|
||||
To find yours:
|
||||
|
||||
1. Ask your DingTalk organization admin — User IDs are configured in the DingTalk admin console under **Contacts** → **Members**.
|
||||
2. Alternatively, the bot logs the `sender_id` for each incoming message. Start the gateway, send the bot a message, then check the logs for your ID.
|
||||
|
||||
## Step 4: Configure Hermes Agent
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
Run the guided setup command:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **DingTalk** when prompted, then paste your Client ID, Client Secret, and allowed user IDs when asked.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
DINGTALK_CLIENT_ID=your-app-key
|
||||
DINGTALK_CLIENT_SECRET=your-app-secret
|
||||
|
||||
# Security: restrict who can interact with the bot
|
||||
DINGTALK_ALLOWED_USERS=user-id-1
|
||||
|
||||
# Multiple allowed users (comma-separated)
|
||||
# DINGTALK_ALLOWED_USERS=user-id-1,user-id-2
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
- `group_sessions_per_user: true` keeps each participant's context isolated inside shared group chats
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
Once configured, start the DingTalk gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
The bot should connect to DingTalk's Stream Mode within a few seconds. Send it a message — either a DM or in a group where it's been added — to test.
|
||||
|
||||
:::tip
|
||||
You can run `hermes gateway` in the background or as a systemd service for persistent operation. See the deployment docs for details.
|
||||
:::
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Bot is not responding to messages
|
||||
|
||||
**Cause**: The robot capability isn't enabled, or `DINGTALK_ALLOWED_USERS` doesn't include your User ID.
|
||||
|
||||
**Fix**: Verify the robot capability is enabled in your app settings and that Stream Mode is selected. Check that your User ID is in `DINGTALK_ALLOWED_USERS`. Restart the gateway.
|
||||
|
||||
### "dingtalk-stream not installed" error
|
||||
|
||||
**Cause**: The `dingtalk-stream` Python package is not installed.
|
||||
|
||||
**Fix**: Install it:
|
||||
|
||||
```bash
|
||||
pip install dingtalk-stream httpx
|
||||
```
|
||||
|
||||
### "DINGTALK_CLIENT_ID and DINGTALK_CLIENT_SECRET required"
|
||||
|
||||
**Cause**: The credentials aren't set in your environment or `.env` file.
|
||||
|
||||
**Fix**: Verify `DINGTALK_CLIENT_ID` and `DINGTALK_CLIENT_SECRET` are set correctly in `~/.hermes/.env`. The Client ID is your AppKey, and the Client Secret is your AppSecret from the DingTalk Developer Console.
|
||||
|
||||
### Stream disconnects / reconnection loops
|
||||
|
||||
**Cause**: Network instability, DingTalk platform maintenance, or credential issues.
|
||||
|
||||
**Fix**: The adapter reconnects automatically, waiting progressively longer between attempts (2s → 5s → 10s → 30s → 60s). Check that your credentials are valid and your app hasn't been deactivated. Verify your network allows outbound WebSocket connections.
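
As a rough illustration of the reconnect schedule quoted above, the sketch below retries a connection with the same delays. Only the delay values come from this guide; the function names, the choice to stay at 60s once the schedule is exhausted, and the use of `ConnectionError` are assumptions made for the example, not the adapter's actual code.

```python
import itertools
from typing import Callable, Iterator

# Delay schedule from the troubleshooting text; everything else is illustrative.
BACKOFF_DELAYS = (2, 5, 10, 30, 60)


def backoff_delays() -> Iterator[int]:
    """Yield reconnect delays in seconds, then repeat the last value forever."""
    yield from BACKOFF_DELAYS
    yield from itertools.repeat(BACKOFF_DELAYS[-1])  # assumption: cap at 60s


def run_with_reconnect(connect: Callable[[], None],
                       sleep: Callable[[float], None],
                       max_attempts: int) -> int:
    """Call connect() until it succeeds; return the number of attempts used."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(next(delays))  # wait before the next attempt
    raise ValueError("max_attempts must be at least 1")
```

Passing `time.sleep` as `sleep` gives real waits; passing a list's `append` method makes the schedule easy to inspect in a test.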
|
||||
|
||||
### Bot is offline
|
||||
|
||||
**Cause**: The Hermes gateway isn't running, or it failed to connect.
|
||||
|
||||
**Fix**: Check that `hermes gateway` is running. Look at the terminal output for error messages. Common issues: wrong credentials, app deactivated, `dingtalk-stream` or `httpx` not installed.
|
||||
|
||||
### "No session_webhook available"
|
||||
|
||||
**Cause**: The bot tried to reply but doesn't have a session webhook URL. This typically happens if the webhook expired or the bot was restarted between receiving the message and sending the reply.
|
||||
|
||||
**Fix**: Send a new message to the bot — each incoming message provides a fresh session webhook for replies. This is a normal DingTalk limitation; the bot can only reply to messages it has received recently.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `DINGTALK_ALLOWED_USERS` to restrict who can interact with the bot. Without it, the gateway denies all users by default as a safety measure. Only add User IDs of people you trust — authorized users have full access to the agent's capabilities, including tool use and system access.
|
||||
:::
|
||||
|
||||
For more information on securing your Hermes Agent deployment, see the [Security Guide](../security.md).
|
||||
|
||||
## Notes
|
||||
|
||||
- **Stream Mode**: No public URL, domain name, or webhook server needed. The connection is initiated from your machine via WebSocket, so it works behind NAT and firewalls.
|
||||
- **Markdown responses**: Replies are formatted in DingTalk's markdown format for rich text display.
|
||||
- **Message deduplication**: The adapter deduplicates messages with a 5-minute window to prevent processing the same message twice.
|
||||
- **Auto-reconnection**: If the stream connection drops, the adapter reconnects automatically with an increasing backoff (2s up to 60s between attempts).
|
||||
- **Message length limit**: Responses are capped at 20,000 characters per message. Longer responses are truncated.
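
The deduplication note above can be pictured with a small time-windowed cache. This is a minimal sketch assuming messages carry a unique string ID; the class name `MessageDeduplicator` and its interface are hypothetical, and only the 5-minute window comes from this guide.

```python
import time
from collections import OrderedDict
from typing import Callable

DEDUP_WINDOW_SECONDS = 5 * 60  # window length stated in the notes above


class MessageDeduplicator:
    """Remember message IDs for a fixed window and flag repeats (illustrative)."""

    def __init__(self, window: float = DEDUP_WINDOW_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window
        self._clock = clock
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def is_duplicate(self, message_id: str) -> bool:
        now = self._clock()
        # Entries are stored in arrival order, so the oldest is always first.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._window:
                break
            self._seen.popitem(last=False)  # expired: forget it
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        return False
```

Injecting the clock keeps the example testable without waiting five minutes.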
|
||||
363
hermes_code/website/docs/user-guide/messaging/discord.md
Normal file
|
|
@ -0,0 +1,363 @@
|
|||
---
|
||||
sidebar_position: 3
|
||||
title: "Discord"
|
||||
description: "Set up Hermes Agent as a Discord bot"
|
||||
---
|
||||
|
||||
# Discord Setup
|
||||
|
||||
Hermes Agent integrates with Discord as a bot, letting you chat with your AI assistant through direct messages or server channels. The bot receives your messages, processes them through the Hermes Agent pipeline (including tool use, memory, and reasoning), and responds in real time. It supports text, voice messages, file attachments, and slash commands.
|
||||
|
||||
Before setup, here's the part most people want to know: how Hermes behaves once it's in your server.
|
||||
|
||||
## How Hermes Behaves
|
||||
|
||||
| Context | Behavior |
|
||||
|---------|----------|
|
||||
| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
|
||||
| **Server channels** | By default, Hermes only responds when you `@mention` it. If you post in a channel without mentioning it, Hermes ignores the message. |
|
||||
| **Free-response channels** | You can make specific channels mention-free with `DISCORD_FREE_RESPONSE_CHANNELS`, or disable mentions globally with `DISCORD_REQUIRE_MENTION=false`. |
|
||||
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. Threads stay isolated from the parent channel for session history. |
|
||||
| **Shared channels with multiple users** | By default, Hermes isolates session history per user inside the channel for safety and clarity. Two people talking in the same channel do not share one transcript unless you explicitly disable that. |
|
||||
|
||||
:::tip
|
||||
If you want a normal bot-help channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS`.
|
||||
:::
|
||||
|
||||
### Discord Gateway Model
|
||||
|
||||
Hermes on Discord is not a webhook that replies statelessly. It runs through the full messaging gateway, which means each incoming message goes through:
|
||||
|
||||
1. authorization (`DISCORD_ALLOWED_USERS`)
|
||||
2. mention / free-response checks
|
||||
3. session lookup
|
||||
4. session transcript loading
|
||||
5. normal Hermes agent execution, including tools, memory, and slash commands
|
||||
6. response delivery back to Discord
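
The first two steps of the list above (authorization, then mention and free-response checks) can be sketched as a single decision function. The dataclasses and field names here are hypothetical stand-ins for whatever the gateway uses internally; the rules themselves follow the behavior table earlier in this page.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class GatewayPolicy:
    """Hypothetical view of DISCORD_ALLOWED_USERS and the mention settings."""
    allowed_users: Set[str]
    require_mention: bool = True
    free_response_channels: Set[str] = field(default_factory=set)


@dataclass
class IncomingMessage:
    """Hypothetical, simplified message record."""
    author_id: str
    channel_id: str
    is_dm: bool
    mentions_bot: bool
    parent_channel_id: Optional[str] = None  # set when the message is in a thread


def should_respond(msg: IncomingMessage, policy: GatewayPolicy) -> bool:
    # Step 1: authorization. Unknown users are ignored.
    if msg.author_id not in policy.allowed_users:
        return False
    # DMs never need a mention.
    if msg.is_dm:
        return True
    # Step 2: mention / free-response checks.
    if not policy.require_mention:
        return True
    if msg.channel_id in policy.free_response_channels:
        return True
    if msg.parent_channel_id in policy.free_response_channels:
        return True  # thread inside a free-response channel
    return msg.mentions_bot
```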
|
||||
|
||||
That matters because behavior in a busy server depends on both Discord routing and Hermes session policy.
|
||||
|
||||
### Session Model in Discord
|
||||
|
||||
By default:
|
||||
|
||||
- each DM gets its own session
|
||||
- each server thread gets its own session namespace
|
||||
- each user in a shared channel gets their own session inside that channel
|
||||
|
||||
So if Alice and Bob both talk to Hermes in `#research`, Hermes treats those as separate conversations by default even though they are using the same visible Discord channel.
|
||||
|
||||
This is controlled by `config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
Set it to `false` only if you explicitly want one shared conversation for the entire room:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: false
|
||||
```
|
||||
|
||||
Shared sessions can be useful for a collaborative room, but they also mean:
|
||||
|
||||
- users share context growth and token costs
|
||||
- one person's long tool-heavy task can bloat everyone else's context
|
||||
- one person's in-flight run can interrupt another person's follow-up in the same room
|
||||
|
||||
### Interrupts and Concurrency
|
||||
|
||||
Hermes tracks running agents by session key.
|
||||
|
||||
With the default `group_sessions_per_user: true`:
|
||||
|
||||
- Alice interrupting her own in-flight request only affects Alice's session in that channel
|
||||
- Bob can keep talking in the same channel without inheriting Alice's history or interrupting Alice's run
|
||||
|
||||
With `group_sessions_per_user: false`:
|
||||
|
||||
- the whole room shares one running-agent slot for that channel/thread
|
||||
- follow-up messages from different people can interrupt or queue behind each other
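
One way to picture the difference between the two settings is to look at how a session key might be built. The key format below is purely an assumption for illustration (the real key layout is not documented here); what matters is that the user ID is part of the key only when `group_sessions_per_user` is enabled.

```python
from typing import Optional


def session_key(platform: str, chat_id: str, user_id: Optional[str], *,
                is_dm: bool, per_user: bool = True,
                thread_id: Optional[str] = None) -> str:
    """Build a hypothetical session key for one incoming message."""
    parts = [platform, chat_id]
    if thread_id is not None:
        parts.append(f"thread:{thread_id}")  # threads get their own namespace
    # In shared chats, the user ID joins the key only in per-user mode.
    # Without a user ID there is nothing to isolate on, so the key is shared.
    if not is_dm and per_user and user_id is not None:
        parts.append(f"user:{user_id}")
    return ":".join(parts)
```

Because running agents are tracked by session key, two users whose keys differ cannot interrupt each other, while users who map to the same key share one running-agent slot.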
|
||||
|
||||
This guide walks you through the full setup process — from creating your bot on Discord's Developer Portal to sending your first message.
|
||||
|
||||
## Step 1: Create a Discord Application
|
||||
|
||||
1. Go to the [Discord Developer Portal](https://discord.com/developers/applications) and sign in with your Discord account.
|
||||
2. Click **New Application** in the top-right corner.
|
||||
3. Enter a name for your application (e.g., "Hermes Agent") and accept the Developer Terms of Service.
|
||||
4. Click **Create**.
|
||||
|
||||
You'll land on the **General Information** page. Note the **Application ID** — you'll need it later to build the invite URL.
|
||||
|
||||
## Step 2: Create the Bot
|
||||
|
||||
1. In the left sidebar, click **Bot**.
|
||||
2. Discord automatically creates a bot user for your application. You'll see the bot's username, which you can customize.
|
||||
3. Under **Authorization Flow**:
|
||||
- Set **Public Bot** to **OFF** — this prevents other people from inviting your bot to their servers.
|
||||
- Leave **Require OAuth2 Code Grant** set to **OFF**.
|
||||
|
||||
:::tip
|
||||
You can set a custom avatar and banner for your bot on this page. This is what users will see in Discord.
|
||||
:::
|
||||
|
||||
## Step 3: Enable Privileged Gateway Intents
|
||||
|
||||
This is the most critical step in the entire setup. Without the correct intents enabled, your bot will connect to Discord but **will not be able to read message content**.
|
||||
|
||||
On the **Bot** page, scroll down to **Privileged Gateway Intents**. You'll see three toggles:
|
||||
|
||||
| Intent | Purpose | Required? |
|
||||
|--------|---------|-----------|
|
||||
| **Presence Intent** | See user online/offline status | Optional |
|
||||
| **Server Members Intent** | Access the member list, resolve usernames | **Required** |
|
||||
| **Message Content Intent** | Read the text content of messages | **Required** |
|
||||
|
||||
**Enable both Server Members Intent and Message Content Intent** by toggling them **ON**.
|
||||
|
||||
- Without **Message Content Intent**, your bot receives message events but the message text is empty — the bot literally cannot see what you typed.
|
||||
- Without **Server Members Intent**, the bot cannot resolve usernames for the allowed users list and may fail to identify who is messaging it.
|
||||
|
||||
:::warning[This is the #1 reason Discord bots don't work]
|
||||
If your bot is online but never responds to messages, the **Message Content Intent** is almost certainly disabled. Go back to the [Developer Portal](https://discord.com/developers/applications), select your application → Bot → Privileged Gateway Intents, and make sure **Message Content Intent** is toggled ON. Click **Save Changes**.
|
||||
:::
|
||||
|
||||
**Regarding server count:**
|
||||
- If your bot is in **fewer than 100 servers**, you can toggle intents on and off freely.
|
||||
- If your bot is in **100 or more servers**, Discord requires you to submit a verification application to use privileged intents. For personal use, this is not a concern.
|
||||
|
||||
Click **Save Changes** at the bottom of the page.
|
||||
|
||||
## Step 4: Get the Bot Token
|
||||
|
||||
The bot token is the credential Hermes Agent uses to log in as your bot. Still on the **Bot** page:
|
||||
|
||||
1. Under the **Token** section, click **Reset Token**.
|
||||
2. If you have two-factor authentication enabled on your Discord account, enter your 2FA code.
|
||||
3. Discord will display your new token. **Copy it immediately.**
|
||||
|
||||
:::warning[Token shown only once]
|
||||
The token is only displayed once. If you lose it, you'll need to reset it and generate a new one. Never share your token publicly or commit it to Git — anyone with this token has full control of your bot.
|
||||
:::
|
||||
|
||||
Store the token somewhere safe (a password manager, for example). You'll need it in Step 8.
|
||||
|
||||
## Step 5: Generate the Invite URL
|
||||
|
||||
You need an OAuth2 URL to invite the bot to your server. There are two ways to do this:
|
||||
|
||||
### Option A: Using the Installation Tab (Recommended)
|
||||
|
||||
1. In the left sidebar, click **Installation**.
|
||||
2. Under **Installation Contexts**, enable **Guild Install**.
|
||||
3. For **Install Link**, select **Discord Provided Link**.
|
||||
4. Under **Default Install Settings** for Guild Install:
|
||||
- **Scopes**: select `bot` and `applications.commands`
|
||||
- **Permissions**: select the permissions listed below.
|
||||
|
||||
### Option B: Manual URL
|
||||
|
||||
You can construct the invite URL directly using this format:
|
||||
|
||||
```
|
||||
https://discord.com/oauth2/authorize?client_id=YOUR_APP_ID&scope=bot+applications.commands&permissions=274878286912
|
||||
```
|
||||
|
||||
Replace `YOUR_APP_ID` with the Application ID from Step 1.
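
If you prefer to generate the link rather than edit it by hand, the following sketch builds the same URL with the standard library. The function name `build_invite_url` is made up for this example; the endpoint, scopes, and default permissions integer are the ones shown above.

```python
from urllib.parse import urlencode


def build_invite_url(app_id: str, permissions: int = 274878286912) -> str:
    """Return a Discord OAuth2 invite URL for the given Application ID."""
    query = urlencode({
        "client_id": app_id,
        # urlencode turns the space into "+", matching the manual URL format.
        "scope": "bot applications.commands",
        "permissions": permissions,
    })
    return f"https://discord.com/oauth2/authorize?{query}"
```

Call it with the Application ID from Step 1, for example `build_invite_url("YOUR_APP_ID")`, and open the result in a browser.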
|
||||
|
||||
### Required Permissions
|
||||
|
||||
These are the minimum permissions your bot needs:
|
||||
|
||||
- **View Channels** — see the channels it has access to
|
||||
- **Send Messages** — respond to your messages
|
||||
- **Embed Links** — format rich responses
|
||||
- **Attach Files** — send images, audio, and file outputs
|
||||
- **Read Message History** — maintain conversation context
|
||||
|
||||
### Recommended Additional Permissions
|
||||
|
||||
- **Send Messages in Threads** — respond in thread conversations
|
||||
- **Add Reactions** — react to messages for acknowledgment
|
||||
|
||||
### Permission Integers
|
||||
|
||||
| Level | Permissions Integer | What's Included |
|-------|---------------------|-----------------|
| Minimal | `117760` | View Channels, Send Messages, Embed Links, Attach Files, Read Message History |
| Recommended | `274878286912` | All of the above plus Add Reactions, Use External Emojis, Send Messages in Threads |
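
A permissions integer is a bitwise OR of individual permission flags, so the two values above can be checked by rebuilding them. The bit positions below are the ones published in Discord's permissions documentation; the constant names are just labels for this sketch.

```python
# Discord permission flags (bit positions from Discord's permissions docs).
ADD_REACTIONS = 1 << 6
VIEW_CHANNEL = 1 << 10
SEND_MESSAGES = 1 << 11
EMBED_LINKS = 1 << 14
ATTACH_FILES = 1 << 15
READ_MESSAGE_HISTORY = 1 << 16
USE_EXTERNAL_EMOJIS = 1 << 18
SEND_MESSAGES_IN_THREADS = 1 << 38

# The five permissions listed under "Required Permissions".
MINIMAL = (VIEW_CHANNEL | SEND_MESSAGES | EMBED_LINKS
           | ATTACH_FILES | READ_MESSAGE_HISTORY)

# The recommended integer adds three more flags on top of the minimal set.
RECOMMENDED = (MINIMAL | ADD_REACTIONS | USE_EXTERNAL_EMOJIS
               | SEND_MESSAGES_IN_THREADS)

FLAG_NAMES = {
    "Add Reactions": ADD_REACTIONS,
    "View Channels": VIEW_CHANNEL,
    "Send Messages": SEND_MESSAGES,
    "Embed Links": EMBED_LINKS,
    "Attach Files": ATTACH_FILES,
    "Read Message History": READ_MESSAGE_HISTORY,
    "Use External Emojis": USE_EXTERNAL_EMOJIS,
    "Send Messages in Threads": SEND_MESSAGES_IN_THREADS,
}


def decode_permissions(value: int) -> list:
    """List the names of the known flags set in a permissions integer."""
    return [name for name, bit in FLAG_NAMES.items() if value & bit]
```

`decode_permissions` only knows the eight flags named here, which is enough to inspect the two integers used in this guide.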
|
||||
|
||||
## Step 6: Invite to Your Server
|
||||
|
||||
1. Open the invite URL in your browser (from the Installation tab or the manual URL you constructed).
|
||||
2. In the **Add to Server** dropdown, select your server.
|
||||
3. Click **Continue**, then **Authorize**.
|
||||
4. Complete the CAPTCHA if prompted.
|
||||
|
||||
:::info
|
||||
You need the **Manage Server** permission on the Discord server to invite a bot. If you don't see your server in the dropdown, ask a server admin to use the invite link instead.
|
||||
:::
|
||||
|
||||
After authorizing, the bot will appear in your server's member list (it will show as offline until you start the Hermes gateway).
|
||||
|
||||
## Step 7: Find Your Discord User ID
|
||||
|
||||
Hermes Agent uses your Discord User ID to control who can interact with the bot. To find it:
|
||||
|
||||
1. Open Discord (desktop or web app).
|
||||
2. Go to **Settings** → **Advanced** → toggle **Developer Mode** to **ON**.
|
||||
3. Close settings.
|
||||
4. Right-click your own username (in a message, the member list, or your profile) → **Copy User ID**.
|
||||
|
||||
Your User ID is a long number like `284102345871466496`.
|
||||
|
||||
:::tip
|
||||
Developer Mode also lets you copy **Channel IDs** and **Server IDs** the same way — right-click the channel or server name and select Copy ID. You'll need a Channel ID if you want to set a home channel manually.
|
||||
:::
|
||||
|
||||
## Step 8: Configure Hermes Agent
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
Run the guided setup command:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Discord** when prompted, then paste your bot token and user ID when asked.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
DISCORD_BOT_TOKEN=your-bot-token
|
||||
DISCORD_ALLOWED_USERS=284102345871466496
|
||||
|
||||
# Multiple allowed users (comma-separated)
|
||||
# DISCORD_ALLOWED_USERS=284102345871466496,198765432109876543
|
||||
|
||||
# Optional: respond without @mention (default: true = require mention)
|
||||
# DISCORD_REQUIRE_MENTION=false
|
||||
|
||||
# Optional: channels where bot responds without @mention (comma-separated channel IDs)
|
||||
# DISCORD_FREE_RESPONSE_CHANNELS=1234567890,9876543210
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
discord:
|
||||
require_mention: true
|
||||
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
- `discord.require_mention: true` keeps Hermes quiet in normal server traffic unless mentioned
|
||||
- `group_sessions_per_user: true` keeps each participant's context isolated inside shared channels and threads
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
Once configured, start the Discord gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
The bot should come online in Discord within a few seconds. Send it a message — either a DM or in a channel it can see — to test.
|
||||
|
||||
:::tip
|
||||
You can run `hermes gateway` in the background or as a systemd service for persistent operation. See the deployment docs for details.
|
||||
:::
|
||||
|
||||
## Home Channel
|
||||
|
||||
You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:
|
||||
|
||||
### Using the Slash Command
|
||||
|
||||
Type `/sethome` in any Discord channel where the bot is present. That channel becomes the home channel.
|
||||
|
||||
### Manual Configuration
|
||||
|
||||
Add these to your `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
DISCORD_HOME_CHANNEL=123456789012345678
|
||||
DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
||||
```
|
||||
|
||||
Replace the ID with the actual channel ID (right-click → Copy Channel ID with Developer Mode on).
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Hermes Agent supports Discord voice messages:
|
||||
|
||||
- **Incoming voice messages** are automatically transcribed using the configured STT provider: local `faster-whisper` (no key), Groq Whisper (`GROQ_API_KEY`), or OpenAI Whisper (`VOICE_TOOLS_OPENAI_KEY`).
|
||||
- **Text-to-speech**: Use `/voice tts` to have the bot send spoken audio responses alongside text replies.
|
||||
- **Discord voice channels**: Hermes can also join a voice channel, listen to users speaking, and talk back in the channel.
|
||||
|
||||
For the full setup and operational guide, see:
|
||||
- [Voice Mode](/docs/user-guide/features/voice-mode)
|
||||
- [Use Voice Mode with Hermes](/docs/guides/use-voice-mode-with-hermes)
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Bot is online but not responding to messages
|
||||
|
||||
**Cause**: Message Content Intent is disabled.
|
||||
|
||||
**Fix**: Go to [Developer Portal](https://discord.com/developers/applications) → your app → Bot → Privileged Gateway Intents → enable **Message Content Intent** → Save Changes. Restart the gateway.
|
||||
|
||||
### "Disallowed Intents" error on startup
|
||||
|
||||
**Cause**: Your code requests intents that aren't enabled in the Developer Portal.
|
||||
|
||||
**Fix**: Enable the Privileged Gateway Intents the bot requests in the Bot settings. Server Members and Message Content are required (see Step 3). If the error persists, enable Presence Intent as well, then restart the gateway.
|
||||
|
||||
### Bot can't see messages in a specific channel
|
||||
|
||||
**Cause**: The bot's role doesn't have permission to view that channel.
|
||||
|
||||
**Fix**: In Discord, go to the channel's settings → Permissions → add the bot's role with **View Channel** and **Read Message History** enabled.
|
||||
|
||||
### 403 Forbidden errors
|
||||
|
||||
**Cause**: The bot is missing required permissions.
|
||||
|
||||
**Fix**: Re-invite the bot with the correct permissions using the URL from Step 5, or manually adjust the bot's role permissions in Server Settings → Roles.
|
||||
|
||||
### Bot is offline
|
||||
|
||||
**Cause**: The Hermes gateway isn't running, or the token is incorrect.
|
||||
|
||||
**Fix**: Check that `hermes gateway` is running. Verify `DISCORD_BOT_TOKEN` in your `.env` file. If you recently reset the token, update it.
|
||||
|
||||
### "User not allowed" / Bot ignores you
|
||||
|
||||
**Cause**: Your User ID isn't in `DISCORD_ALLOWED_USERS`.
|
||||
|
||||
**Fix**: Add your User ID to `DISCORD_ALLOWED_USERS` in `~/.hermes/.env` and restart the gateway.
|
||||
|
||||
### People in the same channel are sharing context unexpectedly
|
||||
|
||||
**Cause**: `group_sessions_per_user` is disabled, or the platform cannot provide a user ID for the messages in that context.
|
||||
|
||||
**Fix**: Set this in `~/.hermes/config.yaml` and restart the gateway:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
If you intentionally want a shared room conversation, leave it off — just expect shared transcript history and shared interrupt behavior.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `DISCORD_ALLOWED_USERS` to restrict who can interact with the bot. Without it, the gateway denies all users by default as a safety measure. Only add User IDs of people you trust — authorized users have full access to the agent's capabilities, including tool use and system access.
|
||||
:::
|
||||
|
||||
For more information on securing your Hermes Agent deployment, see the [Security Guide](../security.md).
|
||||
189
hermes_code/website/docs/user-guide/messaging/email.md
Normal file
|
|
@ -0,0 +1,189 @@
|
|||
---
|
||||
sidebar_position: 7
|
||||
title: "Email"
|
||||
description: "Set up Hermes Agent as an email assistant via IMAP/SMTP"
|
||||
---
|
||||
|
||||
# Email Setup
|
||||
|
||||
Hermes can receive and reply to emails using standard IMAP and SMTP protocols. Send an email to the agent's address and it replies in-thread — no special client or bot API needed. Works with Gmail, Outlook, Yahoo, Fastmail, or any provider that supports IMAP/SMTP.
|
||||
|
||||
:::info[No External Dependencies]
|
||||
The Email adapter uses Python's built-in `imaplib`, `smtplib`, and `email` modules. No additional packages or external services are required.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **A dedicated email account** for your Hermes agent (don't use your personal email)
|
||||
- **IMAP enabled** on the email account
|
||||
- **An app password** if using Gmail or another provider with 2FA
|
||||
|
||||
### Gmail Setup
|
||||
|
||||
1. Enable 2-Factor Authentication on your Google Account
|
||||
2. Go to [App Passwords](https://myaccount.google.com/apppasswords)
|
||||
3. Create a new App Password (select "Mail" or "Other")
|
||||
4. Copy the 16-character password — you'll use this instead of your regular password
|
||||
|
||||
### Outlook / Microsoft 365
|
||||
|
||||
1. Go to [Security Settings](https://account.microsoft.com/security)
|
||||
2. Enable 2FA if not already active
|
||||
3. Create an App Password under "Additional security options"
|
||||
4. IMAP host: `outlook.office365.com`, SMTP host: `smtp.office365.com`
|
||||
|
||||
### Other Providers
|
||||
|
||||
Most email providers support IMAP/SMTP. Check your provider's documentation for:
|
||||
- IMAP host and port (usually port 993 with SSL)
|
||||
- SMTP host and port (usually port 587 with STARTTLS)
|
||||
- Whether app passwords are required
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Configure Hermes
|
||||
|
||||
The easiest way:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Email** from the platform menu. The wizard prompts for your email address, password, IMAP/SMTP hosts, and allowed senders.
|
||||
|
||||
### Manual Configuration
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
EMAIL_ADDRESS=hermes@gmail.com
|
||||
EMAIL_PASSWORD=abcd efgh ijkl mnop # App password (not your regular password)
|
||||
EMAIL_IMAP_HOST=imap.gmail.com
|
||||
EMAIL_SMTP_HOST=smtp.gmail.com
|
||||
|
||||
# Security (recommended)
|
||||
EMAIL_ALLOWED_USERS=your@email.com,colleague@work.com
|
||||
|
||||
# Optional
|
||||
EMAIL_IMAP_PORT=993 # Default: 993 (IMAP SSL)
|
||||
EMAIL_SMTP_PORT=587 # Default: 587 (SMTP STARTTLS)
|
||||
EMAIL_POLL_INTERVAL=15 # Seconds between inbox checks (default: 15)
|
||||
EMAIL_HOME_ADDRESS=your@email.com # Default delivery target for cron jobs
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway # Run in foreground
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux only: boot-time system service
|
||||
```
|
||||
|
||||
On startup, the adapter:
|
||||
1. Tests IMAP and SMTP connections
|
||||
2. Marks all existing inbox messages as "seen" (only processes new emails)
|
||||
3. Starts polling for new messages
|
||||
|
||||
---
|
||||
|
||||
## How It Works
|
||||
|
||||
### Receiving Messages
|
||||
|
||||
The adapter polls the IMAP inbox for UNSEEN messages at a configurable interval (default: 15 seconds). For each new email:
|
||||
|
||||
- **Subject line** is included as context (e.g., `[Subject: Deploy to production]`)
|
||||
- **Reply emails** (subject starting with `Re:`) skip the subject prefix — the thread context is already established
|
||||
- **Attachments** are cached locally:
|
||||
- Images (JPEG, PNG, GIF, WebP) → available to the vision tool
|
||||
- Documents (PDF, ZIP, etc.) → available for file access
|
||||
- **HTML-only emails** have tags stripped for plain text extraction
|
||||
- **Self-messages** are filtered out to prevent reply loops
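
Two of the rules above (the subject prefix and HTML stripping) are simple enough to sketch. The function names and the exact way the subject and body are joined are assumptions; only the `[Subject: ...]` format and the `Re:` exception come from the list above. The regex-based tag stripping is a deliberately crude stand-in, not a full HTML parser.

```python
import re
from html import unescape


def strip_html(html_body: str) -> str:
    """Reduce an HTML-only body to plain text (rough, regex-based)."""
    # Drop script/style blocks entirely, then remove the remaining tags.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", "", html_body)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    # Decode entities such as &amp; and collapse whitespace.
    return re.sub(r"\s+", " ", unescape(text)).strip()


def build_agent_input(subject: str, body: str) -> str:
    """Prefix the body with the subject unless the email is a reply."""
    subject = subject.strip()
    if not subject or subject.lower().startswith("re:"):
        return body  # thread context is already established
    return f"[Subject: {subject}]\n{body}"
```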
|
||||
|
||||
### Sending Replies
|
||||
|
||||
Replies are sent via SMTP with proper email threading:
|
||||
|
||||
- **In-Reply-To** and **References** headers maintain the thread
|
||||
- **Subject line** preserved with `Re:` prefix (no double `Re: Re:`)
|
||||
- **Message-ID** generated with the agent's domain
|
||||
- Responses are sent as plain text (UTF-8)
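
The threading rules above map directly onto Python's standard `email` package. The sketch below builds a reply message without sending it; `build_reply` is a hypothetical helper, and details such as preferring `Reply-To` over `From` are assumptions rather than documented adapter behavior.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_reply(original: EmailMessage, body: str,
                agent_address: str) -> EmailMessage:
    """Build a plain-text reply that stays in the original email's thread."""
    reply = EmailMessage()
    reply["From"] = agent_address
    reply["To"] = str(original.get("Reply-To") or original["From"])

    # Add "Re:" once; never produce "Re: Re:".
    subject = str(original.get("Subject", ""))
    if not subject.lower().startswith("re:"):
        subject = f"Re: {subject}"
    reply["Subject"] = subject

    # In-Reply-To points at the parent; References carries the whole chain.
    original_id = original.get("Message-ID")
    if original_id:
        original_id = str(original_id)
        references = original.get("References")
        reply["In-Reply-To"] = original_id
        reply["References"] = (
            f"{references} {original_id}" if references else original_id
        )

    # New Message-ID under the agent's own domain.
    reply["Message-ID"] = make_msgid(domain=agent_address.split("@", 1)[-1])
    reply.set_content(body)  # text/plain with UTF-8 charset
    return reply
```

The resulting message can be handed to `smtplib.SMTP.send_message` once a connection is open.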
|
||||
|
||||
### File Attachments
|
||||
|
||||
The agent can send file attachments in replies. Include `MEDIA:/path/to/file` in the response and the file is attached to the outgoing email.
|
||||
|
||||
### Skipping Attachments
|
||||
|
||||
To ignore all incoming attachments (for malware protection or bandwidth savings), add to your `config.yaml`:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
email:
|
||||
skip_attachments: true
|
||||
```
|
||||
|
||||
When enabled, attachment and inline parts are skipped before payload decoding. The email body text is still processed normally.
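
To make "skipped before payload decoding" concrete, here is a minimal sketch using the standard `email` package. The function name and return shape are hypothetical; the point is that with the flag on, a part marked `attachment` or `inline` is passed over without ever calling a decode method on it.

```python
from email.message import EmailMessage
from typing import List, Optional, Tuple


def extract_parts(
    msg: EmailMessage, skip_attachments: bool
) -> Tuple[str, List[Tuple[Optional[str], bytes]]]:
    """Return (body_text, attachments) for a parsed email."""
    texts: List[str] = []
    attachments: List[Tuple[Optional[str], bytes]] = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        disposition = part.get_content_disposition()  # 'attachment', 'inline', or None
        if disposition in ("attachment", "inline"):
            if skip_attachments:
                continue  # skipped without decoding the payload
            attachments.append(
                (part.get_filename(), part.get_payload(decode=True))
            )
            continue
        if part.get_content_type() == "text/plain":
            texts.append(part.get_content())  # body text is always processed
    return "\n".join(texts), attachments
```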
|
||||
|
||||
---
|
||||
|
||||
## Access Control
|
||||
|
||||
Email access follows the same pattern as all other Hermes platforms:
|
||||
|
||||
1. **`EMAIL_ALLOWED_USERS` set** → only emails from those addresses are processed
|
||||
2. **No allowlist set** → unknown senders get a pairing code
|
||||
3. **`EMAIL_ALLOW_ALL_USERS=true`** → any sender is accepted (use with caution)
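
The three cases above can be summarized as a small decision function. This is a sketch, not the gateway's code: the names are hypothetical, and the precedence when both an allowlist and `EMAIL_ALLOW_ALL_USERS=true` are set is an assumption (here, allow-all wins).

```python
from enum import Enum
from typing import Set


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    PAIRING_CODE = "pairing_code"


def parse_allowlist(raw: str) -> Set[str]:
    """Split a comma-separated EMAIL_ALLOWED_USERS value into addresses."""
    return {addr.strip().lower() for addr in raw.split(",") if addr.strip()}


def decide(sender: str, allowed_users_raw: str = "",
           allow_all: bool = False) -> Decision:
    """Classify a sender according to the three access-control cases."""
    if allow_all:  # assumption: allow-all overrides any allowlist
        return Decision.ACCEPT
    allowed = parse_allowlist(allowed_users_raw)
    if allowed:
        # Allowlist set: only listed addresses are processed.
        if sender.strip().lower() in allowed:
            return Decision.ACCEPT
        return Decision.REJECT
    # No allowlist: unknown senders go through pairing.
    return Decision.PAIRING_CODE
```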
|
||||
|
||||
:::warning
|
||||
**Always configure `EMAIL_ALLOWED_USERS`.** Without it, anyone who knows the agent's email address could send commands. The agent has terminal access by default.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| **"IMAP connection failed"** at startup | Verify `EMAIL_IMAP_HOST` and `EMAIL_IMAP_PORT`. Ensure IMAP is enabled on the account. For Gmail, enable it in Settings → Forwarding and POP/IMAP. |
|
||||
| **"SMTP connection failed"** at startup | Verify `EMAIL_SMTP_HOST` and `EMAIL_SMTP_PORT`. Check that your password is correct (use App Password for Gmail). |
|
||||
| **Messages not received** | Check `EMAIL_ALLOWED_USERS` includes the sender's email. Check spam folder — some providers flag automated replies. |
|
||||
| **"Authentication failed"** | For Gmail, you must use an App Password, not your regular password. Ensure 2FA is enabled first. |
|
||||
| **Duplicate replies** | Ensure only one gateway instance is running. Check `hermes gateway status`. |
|
||||
| **Slow response** | The default poll interval is 15 seconds. Reduce with `EMAIL_POLL_INTERVAL=5` for faster response (but more IMAP connections). |
|
||||
| **Replies not threading** | The adapter sets `In-Reply-To` and `References` headers. Some email clients (especially web-based) may not thread correctly with automated messages. |
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
**Use a dedicated email account.** Don't use your personal email — the agent stores the password in `.env` and has full inbox access via IMAP.
|
||||
:::
|
||||
|
||||
- Use **App Passwords** instead of your main password (required for Gmail with 2FA)
|
||||
- Set `EMAIL_ALLOWED_USERS` to restrict who can interact with the agent
|
||||
- The password is stored in `~/.hermes/.env` — protect this file (`chmod 600`)
|
||||
- IMAP uses SSL (port 993) and SMTP uses STARTTLS (port 587) by default — connections are encrypted
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables Reference
|
||||
|
||||
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `EMAIL_ADDRESS` | Yes | — | Agent's email address |
| `EMAIL_PASSWORD` | Yes | — | Email password or app password |
| `EMAIL_IMAP_HOST` | Yes | — | IMAP server host (e.g., `imap.gmail.com`) |
| `EMAIL_SMTP_HOST` | Yes | — | SMTP server host (e.g., `smtp.gmail.com`) |
| `EMAIL_IMAP_PORT` | No | `993` | IMAP server port |
| `EMAIL_SMTP_PORT` | No | `587` | SMTP server port |
| `EMAIL_POLL_INTERVAL` | No | `15` | Seconds between inbox checks |
| `EMAIL_ALLOWED_USERS` | No | — | Comma-separated allowed sender addresses |
| `EMAIL_HOME_ADDRESS` | No | — | Default delivery target for cron jobs |
| `EMAIL_ALLOW_ALL_USERS` | No | `false` | Allow all senders (not recommended) |
|
||||
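The interaction between `EMAIL_ALLOWED_USERS` and `EMAIL_ALLOW_ALL_USERS` can be sketched as follows. This is an illustration only, not the adapter's actual code: the function names `parse_allowed` and `is_sender_allowed` are hypothetical, and the sketch assumes addresses are compared case-insensitively and that an empty allowlist denies everyone.

```python
def parse_allowed(csv: str) -> set[str]:
    """Split a comma-separated allowlist into normalized addresses."""
    return {addr.strip().lower() for addr in csv.split(",") if addr.strip()}


def is_sender_allowed(sender: str, allowed_csv: str = "", allow_all: bool = False) -> bool:
    """Hypothetical check modeled on EMAIL_ALLOWED_USERS / EMAIL_ALLOW_ALL_USERS."""
    if allow_all:
        return True  # EMAIL_ALLOW_ALL_USERS=true bypasses the allowlist
    # Assumption: no allowlist means no sender is accepted.
    return sender.strip().lower() in parse_allowed(allowed_csv)
```

For example, `is_sender_allowed("stranger@example.com", "trusted@example.com")` is false under these assumptions.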
249
hermes_code/website/docs/user-guide/messaging/homeassistant.md
Normal file
|
|
@ -0,0 +1,249 @@
|
|||
---
|
||||
title: Home Assistant
|
||||
description: Control your smart home with Hermes Agent via Home Assistant integration.
|
||||
sidebar_label: Home Assistant
|
||||
sidebar_position: 5
|
||||
---
|
||||
|
||||
# Home Assistant Integration
|
||||
|
||||
Hermes Agent integrates with [Home Assistant](https://www.home-assistant.io/) in two ways:
|
||||
|
||||
1. **Gateway platform** — subscribes to real-time state changes via WebSocket and responds to events
|
||||
2. **Smart home tools** — four LLM-callable tools for querying and controlling devices via the REST API
|
||||
|
||||
## Setup
|
||||
|
||||
### 1. Create a Long-Lived Access Token
|
||||
|
||||
1. Open your Home Assistant instance
|
||||
2. Go to your **Profile** (click your name in the sidebar)
|
||||
3. Scroll to **Long-Lived Access Tokens**
|
||||
4. Click **Create Token**, give it a name like "Hermes Agent"
|
||||
5. Copy the token
|
||||
|
||||
### 2. Configure Environment Variables
|
||||
|
||||
```bash
|
||||
# Add to ~/.hermes/.env
|
||||
|
||||
# Required: your Long-Lived Access Token
|
||||
HASS_TOKEN=your-long-lived-access-token
|
||||
|
||||
# Optional: HA URL (default: http://homeassistant.local:8123)
|
||||
HASS_URL=http://192.168.1.100:8123
|
||||
```
|
||||
|
||||
:::info
|
||||
The `homeassistant` toolset is automatically enabled when `HASS_TOKEN` is set. Both the gateway platform and the device control tools activate from this single token.
|
||||
:::
|
||||
|
||||
### 3. Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
Home Assistant will appear as a connected platform alongside any other messaging platforms (Telegram, Discord, etc.).
|
||||
|
||||
## Available Tools
|
||||
|
||||
Hermes Agent registers four tools for smart home control:
|
||||
|
||||
### `ha_list_entities`
|
||||
|
||||
List Home Assistant entities, optionally filtered by domain or area.
|
||||
|
||||
**Parameters:**
|
||||
- `domain` *(optional)* — Filter by entity domain: `light`, `switch`, `climate`, `sensor`, `binary_sensor`, `cover`, `fan`, `media_player`, etc.
|
||||
- `area` *(optional)* — Filter by area/room name (matches against friendly names): `living room`, `kitchen`, `bedroom`, etc.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
List all lights in the living room
|
||||
```
|
||||
|
||||
Returns entity IDs, states, and friendly names.
|
||||
|
||||
### `ha_get_state`
|
||||
|
||||
Get detailed state of a single entity, including all attributes (brightness, color, temperature setpoint, sensor readings, etc.).
|
||||
|
||||
**Parameters:**
|
||||
- `entity_id` *(required)* — The entity to query, e.g., `light.living_room`, `climate.thermostat`, `sensor.temperature`
|
||||
|
||||
**Example:**
|
||||
```
|
||||
What's the current state of climate.thermostat?
|
||||
```
|
||||
|
||||
Returns: state, all attributes, last changed/updated timestamps.
|
||||
|
||||
### `ha_list_services`
|
||||
|
||||
List available services (actions) for device control. Shows what actions can be performed on each device type and what parameters they accept.
|
||||
|
||||
**Parameters:**
|
||||
- `domain` *(optional)* — Filter by domain, e.g., `light`, `climate`, `switch`
|
||||
|
||||
**Example:**
|
||||
```
|
||||
What services are available for climate devices?
|
||||
```
|
||||
|
||||
### `ha_call_service`
|
||||
|
||||
Call a Home Assistant service to control a device.
|
||||
|
||||
**Parameters:**
|
||||
- `domain` *(required)* — Service domain: `light`, `switch`, `climate`, `cover`, `media_player`, `fan`, `scene`, `script`
|
||||
- `service` *(required)* — Service name: `turn_on`, `turn_off`, `toggle`, `set_temperature`, `set_hvac_mode`, `open_cover`, `close_cover`, `set_volume_level`
|
||||
- `entity_id` *(optional)* — Target entity, e.g., `light.living_room`
|
||||
- `data` *(optional)* — Additional parameters as a JSON object
|
||||
|
||||
**Examples:**
|
||||
|
||||
```
|
||||
Turn on the living room lights
|
||||
→ ha_call_service(domain="light", service="turn_on", entity_id="light.living_room")
|
||||
```
|
||||
|
||||
```
|
||||
Set the thermostat to 22 degrees in heat mode
|
||||
→ ha_call_service(domain="climate", service="set_temperature",
|
||||
entity_id="climate.thermostat", data={"temperature": 22, "hvac_mode": "heat"})
|
||||
```
|
||||
|
||||
```
|
||||
Set living room lights to blue at 50% brightness
|
||||
→ ha_call_service(domain="light", service="turn_on",
|
||||
entity_id="light.living_room", data={"brightness": 128, "color_name": "blue"})
|
||||
```
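For orientation, a call like the one above corresponds to a `POST /api/services/<domain>/<service>` request against Home Assistant's REST API with a bearer token. The sketch below only builds such a request and does not send it. The helper names `build_service_request` and `percent_to_brightness` are hypothetical, and how Hermes issues the call internally is not shown in this document.

```python
import json
import urllib.request


def percent_to_brightness(percent: float) -> int:
    """Convert 0-100% to Home Assistant's 0-255 brightness scale (round half up)."""
    return int(percent / 100 * 255 + 0.5)


def build_service_request(base_url, token, domain, service, entity_id=None, data=None):
    """Build (but do not send) a POST to /api/services/<domain>/<service>."""
    payload = dict(data or {})
    if entity_id:
        payload["entity_id"] = entity_id
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/services/{domain}/{service}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_service_request(
    "http://homeassistant.local:8123",   # default HASS_URL from the setup section
    "your-long-lived-access-token",      # placeholder HASS_TOKEN
    "light",
    "turn_on",
    entity_id="light.living_room",
    data={"brightness": percent_to_brightness(50), "color_name": "blue"},
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

The brightness helper shows why 50% appears as `128` in the example: Home Assistant uses a 0-255 scale.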
|
||||
|
||||
## Gateway Platform: Real-Time Events
|
||||
|
||||
The Home Assistant gateway adapter connects via WebSocket and subscribes to `state_changed` events. When a device state changes and matches your filters, it's forwarded to the agent as a message.
|
||||
|
||||
### Event Filtering
|
||||
|
||||
:::warning Required Configuration
|
||||
By default, **no events are forwarded**. You must configure at least one of `watch_domains`, `watch_entities`, or `watch_all` to receive events. Without filters, a warning is logged at startup and all state changes are silently dropped.
|
||||
:::
|
||||
|
||||
Configure which events the agent sees in `~/.hermes/gateway.json` under the Home Assistant platform's `extra` section:
|
||||
|
||||
```json
|
||||
{
|
||||
"platforms": {
|
||||
"homeassistant": {
|
||||
"enabled": true,
|
||||
"extra": {
|
||||
"watch_domains": ["climate", "binary_sensor", "alarm_control_panel", "light"],
|
||||
"watch_entities": ["sensor.front_door_battery"],
|
||||
"ignore_entities": ["sensor.uptime", "sensor.cpu_usage", "sensor.memory_usage"],
|
||||
"cooldown_seconds": 30
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
| Setting | Default | Description |
|---------|---------|-------------|
| `watch_domains` | *(none)* | Only watch these entity domains (e.g., `climate`, `light`, `binary_sensor`) |
| `watch_entities` | *(none)* | Only watch these specific entity IDs |
| `watch_all` | `false` | Set to `true` to receive **all** state changes (not recommended for most setups) |
| `ignore_entities` | *(none)* | Always ignore these entities (applied before domain/entity filters) |
| `cooldown_seconds` | `30` | Minimum seconds between events for the same entity |
|
||||
|
||||
:::tip
|
||||
Start with a focused set of domains — `climate`, `binary_sensor`, and `alarm_control_panel` cover the most useful automations. Add more as needed. Use `ignore_entities` to suppress noisy sensors like CPU temperature or uptime counters.
|
||||
:::
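The filter settings can be read as a small decision procedure. The following is a minimal sketch under stated assumptions, not the adapter's implementation: the class name `EventFilter` is hypothetical, and it assumes that `watch_domains` and `watch_entities` are combined with OR (an entity passes if either matches) and that the cooldown is measured from the last forwarded event.

```python
import time


class EventFilter:
    """Hypothetical model of the watch/ignore/cooldown settings."""

    def __init__(self, watch_domains=(), watch_entities=(), watch_all=False,
                 ignore_entities=(), cooldown_seconds=30):
        self.watch_domains = set(watch_domains)
        self.watch_entities = set(watch_entities)
        self.watch_all = watch_all
        self.ignore_entities = set(ignore_entities)
        self.cooldown_seconds = cooldown_seconds
        self._last_forwarded = {}  # entity_id -> timestamp of last forwarded event

    def should_forward(self, entity_id, now=None):
        now = time.monotonic() if now is None else now
        # 1. ignore_entities is applied before any watch filter.
        if entity_id in self.ignore_entities:
            return False
        # 2. With no watch_* setting, nothing matches, so nothing is forwarded.
        domain = entity_id.split(".", 1)[0]
        watched = (self.watch_all
                   or domain in self.watch_domains
                   or entity_id in self.watch_entities)
        if not watched:
            return False
        # 3. Enforce the per-entity cooldown.
        last = self._last_forwarded.get(entity_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_forwarded[entity_id] = now
        return True
```

With `watch_domains=["climate"]`, a second `climate.thermostat` change 10 seconds after the first would be dropped, and one 30 seconds later would pass.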
|
||||
|
||||
### Event Formatting
|
||||
|
||||
State changes are formatted as human-readable messages based on domain:
|
||||
|
||||
| Domain | Format |
|--------|--------|
| `climate` | "HVAC mode changed from 'off' to 'heat' (current: 21, target: 23)" |
| `sensor` | "changed from 21°C to 22°C" |
| `binary_sensor` | "triggered" / "cleared" |
| `light`, `switch`, `fan` | "turned on" / "turned off" |
| `alarm_control_panel` | "alarm state changed from 'armed_away' to 'triggered'" |
| *(other)* | "changed from 'old' to 'new'" |
|
||||
|
||||
### Agent Responses
|
||||
|
||||
Outbound messages from the agent are delivered as **Home Assistant persistent notifications** (via `persistent_notification.create`). These appear in the HA notification panel with the title "Hermes Agent".
|
||||
|
||||
### Connection Management
|
||||
|
||||
- **WebSocket** with 30-second heartbeat for real-time events
|
||||
- **Automatic reconnection** with backoff: 5s → 10s → 30s → 60s
|
||||
- **REST API** for outbound notifications (separate session to avoid WebSocket conflicts)
|
||||
- **Authorization** — HA events are always authorized (no user allowlist needed, since the `HASS_TOKEN` authenticates the connection)
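The reconnection schedule listed above can be expressed as a generator. This is a sketch only: the name `reconnect_delays` is hypothetical, and it assumes the delay stays at 60 seconds once the last step is reached, which the list does not state explicitly.

```python
from itertools import islice


def reconnect_delays(steps=(5, 10, 30, 60)):
    """Yield each delay once, then repeat the final delay indefinitely."""
    for delay in steps:
        yield delay
    while True:
        yield steps[-1]


first_six = list(islice(reconnect_delays(), 6))
```

A reconnect loop would sleep for the next value after each failed attempt and create a fresh generator after a successful connection.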
|
||||
|
||||
## Security
|
||||
|
||||
The Home Assistant tools enforce security restrictions:
|
||||
|
||||
:::warning Blocked Domains
|
||||
The following service domains are **blocked** to prevent arbitrary code execution on the HA host:
|
||||
|
||||
- `shell_command` — arbitrary shell commands
|
||||
- `command_line` — sensors/switches that execute commands
|
||||
- `python_script` — scripted Python execution
|
||||
- `pyscript` — broader scripting integration
|
||||
- `hassio` — addon control, host shutdown/reboot
|
||||
- `rest_command` — HTTP requests from HA server (SSRF vector)
|
||||
|
||||
Attempting to call services in these domains returns an error.
|
||||
:::
|
||||
|
||||
Entity IDs are validated against the pattern `^[a-z_][a-z0-9_]*\.[a-z0-9_]+$` to prevent injection attacks.
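Both checks are easy to reproduce. The sketch below uses the exact pattern and the blocked domains listed above; the names `is_valid_entity_id` and `is_blocked_domain` are hypothetical, and the use of `fullmatch` (which also rejects a trailing newline) is a choice made for this example rather than something the document specifies.

```python
import re

ENTITY_ID_RE = re.compile(r"^[a-z_][a-z0-9_]*\.[a-z0-9_]+$")

BLOCKED_DOMAINS = frozenset({
    "shell_command", "command_line", "python_script",
    "pyscript", "hassio", "rest_command",
})


def is_valid_entity_id(entity_id: str) -> bool:
    """True if the ID is '<domain>.<object_id>' in lowercase letters, digits, underscores."""
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


def is_blocked_domain(domain: str) -> bool:
    """True if calling services in this domain should be refused."""
    return domain in BLOCKED_DOMAINS
```

Under this pattern, `light.living_room` is accepted while `Light.Kitchen` and `light.living_room; rm -rf /` are rejected.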
|
||||
|
||||
## Example Automations
|
||||
|
||||
### Morning Routine
|
||||
|
||||
```
|
||||
User: Start my morning routine
|
||||
|
||||
Agent:
|
||||
1. ha_call_service(domain="light", service="turn_on",
|
||||
entity_id="light.bedroom", data={"brightness": 128})
|
||||
2. ha_call_service(domain="climate", service="set_temperature",
|
||||
entity_id="climate.thermostat", data={"temperature": 22})
|
||||
3. ha_call_service(domain="media_player", service="turn_on",
|
||||
entity_id="media_player.kitchen_speaker")
|
||||
```
|
||||
|
||||
### Security Check
|
||||
|
||||
```
|
||||
User: Is the house secure?
|
||||
|
||||
Agent:
|
||||
1. ha_list_entities(domain="binary_sensor")
|
||||
→ checks door/window sensors
|
||||
2. ha_get_state(entity_id="alarm_control_panel.home")
|
||||
→ checks alarm status
|
||||
3. ha_list_entities(domain="lock")
|
||||
→ checks lock states
|
||||
4. Reports: "All doors closed, alarm is armed_away, all locks engaged."
|
||||
```
|
||||
|
||||
### Reactive Automation (via Gateway Events)
|
||||
|
||||
When connected as a gateway platform, the agent can react to events:
|
||||
|
||||
```
|
||||
[Home Assistant] Front Door: triggered (was cleared)
|
||||
|
||||
Agent automatically:
|
||||
1. ha_get_state(entity_id="binary_sensor.front_door")
|
||||
2. ha_call_service(domain="light", service="turn_on",
|
||||
entity_id="light.hallway")
|
||||
3. Sends notification: "Front door opened. Hallway lights turned on."
|
||||
```
|
||||
332
hermes_code/website/docs/user-guide/messaging/index.md
Normal file
|
|
@ -0,0 +1,332 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Messaging Gateway"
|
||||
description: "Chat with Hermes from Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Home Assistant, Mattermost, Matrix, DingTalk, Webhooks, or any OpenAI-compatible frontend via the API server — architecture and setup overview"
|
||||
---
|
||||
|
||||
# Messaging Gateway
|
||||
|
||||
Chat with Hermes from Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Home Assistant, Mattermost, Matrix, DingTalk, or your browser. The gateway is a single background process that connects to all your configured platforms, handles sessions, runs cron jobs, and delivers voice messages.
|
||||
|
||||
For the full voice feature set — including CLI microphone mode, spoken replies in messaging, and Discord voice-channel conversations — see [Voice Mode](/docs/user-guide/features/voice-mode) and [Use Voice Mode with Hermes](/docs/guides/use-voice-mode-with-hermes).
|
||||
|
||||
## Architecture
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
subgraph Gateway["Hermes Gateway"]
|
||||
subgraph Adapters["Platform adapters"]
|
||||
tg[Telegram]
|
||||
dc[Discord]
|
||||
wa[WhatsApp]
|
||||
sl[Slack]
|
||||
sig[Signal]
|
||||
sms[SMS]
|
||||
em[Email]
|
||||
ha[Home Assistant]
|
||||
mm[Mattermost]
|
||||
mx[Matrix]
|
||||
dt[DingTalk]
|
||||
api["API Server<br/>(OpenAI-compatible)"]
|
||||
wh[Webhooks]
|
||||
end
|
||||
|
||||
store["Session store<br/>per chat"]
|
||||
agent["AIAgent<br/>run_agent.py"]
|
||||
cron["Cron scheduler<br/>ticks every 60s"]
|
||||
end
|
||||
|
||||
tg --> store
|
||||
dc --> store
|
||||
wa --> store
|
||||
sl --> store
|
||||
sig --> store
|
||||
sms --> store
|
||||
em --> store
|
||||
ha --> store
|
||||
mm --> store
|
||||
mx --> store
|
||||
dt --> store
|
||||
api --> store
|
||||
wh --> store
|
||||
store --> agent
|
||||
cron --> store
|
||||
```
|
||||
|
||||
Each platform adapter receives messages, routes them through a per-chat session store, and dispatches them to the AIAgent for processing. The gateway also runs the cron scheduler, ticking every 60 seconds to execute any due jobs.
|
||||
|
||||
## Quick Setup
|
||||
|
||||
The easiest way to configure messaging platforms is the interactive wizard:
|
||||
|
||||
```bash
|
||||
hermes gateway setup # Interactive setup for all messaging platforms
|
||||
```
|
||||
|
||||
This walks you through configuring each platform with arrow-key selection, shows which platforms are already configured, and offers to start/restart the gateway when done.
|
||||
|
||||
## Gateway Commands
|
||||
|
||||
```bash
|
||||
hermes gateway # Run in foreground
|
||||
hermes gateway setup # Configure messaging platforms interactively
|
||||
hermes gateway install # Install as a user service (Linux) / launchd service (macOS)
|
||||
sudo hermes gateway install --system # Linux only: install a boot-time system service
|
||||
hermes gateway start # Start the default service
|
||||
hermes gateway stop # Stop the default service
|
||||
hermes gateway status # Check default service status
|
||||
hermes gateway status --system # Linux only: inspect the system service explicitly
|
||||
```
|
||||
|
||||
## Chat Commands (Inside Messaging)
|
||||
|
||||
| Command | Description |
|---------|-------------|
| `/new` or `/reset` | Start a fresh conversation |
| `/model [provider:model]` | Show or change the model (supports `provider:model` syntax) |
| `/provider` | Show available providers with auth status |
| `/personality [name]` | Set a personality |
| `/retry` | Retry the last message |
| `/undo` | Remove the last exchange |
| `/status` | Show session info |
| `/stop` | Stop the running agent |
| `/approve` | Approve a pending dangerous command |
| `/deny` | Reject a pending dangerous command |
| `/sethome` | Set this chat as the home channel |
| `/compress` | Manually compress conversation context |
| `/title [name]` | Set or show the session title |
| `/resume [name]` | Resume a previously named session |
| `/usage` | Show token usage for this session |
| `/insights [days]` | Show usage insights and analytics |
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display |
| `/voice [on\|off\|tts\|join\|leave\|status]` | Control messaging voice replies and Discord voice-channel behavior |
| `/rollback [number]` | List or restore filesystem checkpoints |
| `/background <prompt>` | Run a prompt in a separate background session |
| `/reload-mcp` | Reload MCP servers from config |
| `/update` | Update Hermes Agent to the latest version |
| `/help` | Show available commands |
| `/<skill-name>` | Invoke any installed skill |
|
||||
|
||||
## Session Management
|
||||
|
||||
### Session Persistence
|
||||
|
||||
Sessions persist across messages until they reset. The agent remembers your conversation context.
|
||||
|
||||
### Reset Policies
|
||||
|
||||
Sessions reset based on configurable policies:
|
||||
|
||||
| Policy | Default | Description |
|--------|---------|-------------|
| Daily | 4:00 AM | Reset at a specific hour each day |
| Idle | 1440 min | Reset after N minutes of inactivity |
| Both | (combined) | Whichever triggers first |
|
||||
|
||||
Configure per-platform overrides in `~/.hermes/gateway.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"reset_by_platform": {
|
||||
"telegram": { "mode": "idle", "idle_minutes": 240 },
|
||||
"discord": { "mode": "idle", "idle_minutes": 60 }
|
||||
}
|
||||
}
|
||||
```
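To make the three policies concrete, here is a minimal sketch of how a reset decision could be computed. It is not the gateway's implementation: the function name `should_reset` and the `at_hour` parameter are hypothetical, and the sketch assumes a daily reset fires once the configured hour has passed since the session's last activity.

```python
from datetime import datetime, timedelta


def should_reset(mode, last_activity, now, at_hour=4, idle_minutes=1440):
    """Return True if a session last active at `last_activity` should reset at `now`."""
    idle_expired = now - last_activity >= timedelta(minutes=idle_minutes)

    # Most recent daily boundary at or before `now`.
    boundary = now.replace(hour=at_hour, minute=0, second=0, microsecond=0)
    if boundary > now:
        boundary -= timedelta(days=1)
    daily_crossed = last_activity < boundary

    if mode == "idle":
        return idle_expired
    if mode == "daily":
        return daily_crossed
    if mode == "both":
        return idle_expired or daily_crossed  # whichever triggers first
    raise ValueError(f"unknown reset mode: {mode!r}")
```

With the Telegram override above (`idle`, 240 minutes), a session last used at 23:00 would reset on a message at 05:00 the next day, because six hours exceed the idle limit.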
|
||||
|
||||
## Security
|
||||
|
||||
**By default, the gateway denies all users who are not in an allowlist or paired via DM.** This is the safe default for a bot with terminal access.
|
||||
|
||||
```bash
|
||||
# Restrict to specific users (recommended):
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321
|
||||
DISCORD_ALLOWED_USERS=123456789012345678
|
||||
SIGNAL_ALLOWED_USERS=+155****4567,+155****6543
|
||||
SMS_ALLOWED_USERS=+155****4567,+155****6543
|
||||
EMAIL_ALLOWED_USERS=trusted@example.com,colleague@work.com
|
||||
MATTERMOST_ALLOWED_USERS=3uo8dkh1p7g1mfk49ear5fzs5c
|
||||
MATRIX_ALLOWED_USERS=@alice:matrix.org
|
||||
DINGTALK_ALLOWED_USERS=user-id-1
|
||||
|
||||
# Or allow specific users on every platform:
|
||||
GATEWAY_ALLOWED_USERS=123456789,987654321
|
||||
|
||||
# Or explicitly allow all users (NOT recommended for bots with terminal access):
|
||||
GATEWAY_ALLOW_ALL_USERS=true
|
||||
```
|
||||
|
||||
### DM Pairing (Alternative to Allowlists)
|
||||
|
||||
Instead of manually configuring user IDs, unknown users receive a one-time pairing code when they DM the bot:
|
||||
|
||||
```bash
|
||||
# The user sees: "Pairing code: XKGH5N7P"
|
||||
# You approve them with:
|
||||
hermes pairing approve telegram XKGH5N7P
|
||||
|
||||
# Other pairing commands:
|
||||
hermes pairing list # View pending + approved users
|
||||
hermes pairing revoke telegram 123456789 # Remove access
|
||||
```
|
||||
|
||||
Pairing codes expire after 1 hour, are rate-limited, and use cryptographic randomness.
|
||||
|
||||
## Interrupting the Agent
|
||||
|
||||
Send any message while the agent is working to interrupt it. Key behaviors:
|
||||
|
||||
- **In-progress terminal commands are killed immediately** (SIGTERM, then SIGKILL after 1s)
|
||||
- **Tool calls are cancelled** — only the currently-executing one runs, the rest are skipped
|
||||
- **Multiple messages are combined** — messages sent during interruption are joined into one prompt
|
||||
- **`/stop` command** — interrupts without queuing a follow-up message
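The terminate-then-kill behavior in the first bullet follows a common pattern, sketched here with the standard `subprocess` module. The function name `stop_process` is hypothetical and this is not the gateway's code. It assumes a POSIX system, where `terminate()` sends SIGTERM and `kill()` sends SIGKILL.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 1.0) -> int:
    """Ask the process to exit, then force-kill it if it is still alive."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)  # give it up to 1s to exit cleanly
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        proc.wait()
    return proc.returncode


# Demo: start a long-running child process and interrupt it.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(proc)
```

The grace period lets well-behaved commands clean up, while the forced kill guarantees the interrupt completes quickly.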
|
||||
|
||||
## Tool Progress Notifications
|
||||
|
||||
Control how much tool activity is displayed in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
display:
|
||||
tool_progress: all # off | new | all | verbose
|
||||
```
|
||||
|
||||
When enabled, the bot sends status messages as it works:
|
||||
|
||||
```text
|
||||
💻 `ls -la`...
|
||||
🔍 web_search...
|
||||
📄 web_extract...
|
||||
🐍 execute_code...
|
||||
```
|
||||
|
||||
## Background Sessions
|
||||
|
||||
Run a prompt in a separate background session so the agent works on it independently while your main chat stays responsive:
|
||||
|
||||
```
|
||||
/background Check all servers in the cluster and report any that are down
|
||||
```
|
||||
|
||||
Hermes confirms immediately:
|
||||
|
||||
```
|
||||
🔄 Background task started: "Check all servers in the cluster..."
|
||||
Task ID: bg_143022_a1b2c3
|
||||
```
|
||||
|
||||
### How It Works
|
||||
|
||||
Each `/background` prompt spawns a **separate agent instance** that runs asynchronously:
|
||||
|
||||
- **Isolated session** — the background agent has its own session with its own conversation history. It has no knowledge of your current chat context and receives only the prompt you provide.
|
||||
- **Same configuration** — inherits your model, provider, toolsets, reasoning settings, and provider routing from the current gateway setup.
|
||||
- **Non-blocking** — your main chat stays fully interactive. Send messages, run other commands, or start more background tasks while it works.
|
||||
- **Result delivery** — when the task finishes, the result is sent back to the **same chat or channel** where you issued the command, prefixed with "✅ Background task complete". If it fails, you'll see "❌ Background task failed" with the error.
|
||||
|
||||
### Background Process Notifications
|
||||
|
||||
When the agent running a background session uses `terminal(background=true)` to start long-running processes (servers, builds, etc.), the gateway can push status updates to your chat. Control this with `display.background_process_notifications` in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
display:
|
||||
background_process_notifications: all # all | result | error | off
|
||||
```
|
||||
|
||||
| Mode | What you receive |
|------|-----------------|
| `all` | Running-output updates **and** the final completion message (default) |
| `result` | Only the final completion message (regardless of exit code) |
| `error` | Only the final message when the exit code is non-zero |
| `off` | No process watcher messages at all |
|
||||
|
||||
You can also set this via environment variable:
|
||||
|
||||
```bash
|
||||
HERMES_BACKGROUND_NOTIFICATIONS=result
|
||||
```
|
||||
|
||||
### Use Cases
|
||||
|
||||
- **Server monitoring** — "/background Check the health of all services and alert me if anything is down"
|
||||
- **Long builds** — "/background Build and deploy the staging environment" while you continue chatting
|
||||
- **Research tasks** — "/background Research competitor pricing and summarize in a table"
|
||||
- **File operations** — "/background Organize the photos in ~/Downloads by date into folders"
|
||||
|
||||
:::tip
|
||||
Background tasks on messaging platforms are fire-and-forget — you don't need to wait or check on them. Results arrive in the same chat automatically when the task finishes.
|
||||
:::
|
||||
|
||||
## Service Management
|
||||
|
||||
### Linux (systemd)
|
||||
|
||||
```bash
|
||||
hermes gateway install # Install as user service
|
||||
hermes gateway start # Start the service
|
||||
hermes gateway stop # Stop the service
|
||||
hermes gateway status # Check status
|
||||
journalctl --user -u hermes-gateway -f # View logs
|
||||
|
||||
# Enable lingering (keeps running after logout)
|
||||
sudo loginctl enable-linger $USER
|
||||
|
||||
# Or install a boot-time system service that still runs as your user
|
||||
sudo hermes gateway install --system
|
||||
sudo hermes gateway start --system
|
||||
sudo hermes gateway status --system
|
||||
journalctl -u hermes-gateway -f
|
||||
```
|
||||
|
||||
Use the user service on laptops and dev boxes. Use the system service on VPS or headless hosts that should come back at boot without relying on systemd linger.
|
||||
|
||||
Avoid keeping both the user and system gateway units installed at once unless you really mean to. Hermes will warn if it detects both because start/stop/status behavior gets ambiguous.
|
||||
|
||||
:::info Multiple installations
|
||||
If you run multiple Hermes installations on the same machine (with different `HERMES_HOME` directories), each gets its own systemd service name. The default `~/.hermes` uses `hermes-gateway`; other installations use `hermes-gateway-<hash>`. The `hermes gateway` commands automatically target the correct service for your current `HERMES_HOME`.
|
||||
:::
|
||||
|
||||
### macOS (launchd)
|
||||
|
||||
```bash
|
||||
hermes gateway install
|
||||
launchctl start ai.hermes.gateway
|
||||
launchctl stop ai.hermes.gateway
|
||||
tail -f ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
## Platform-Specific Toolsets
|
||||
|
||||
Each platform has its own toolset:
|
||||
|
||||
| Platform | Toolset | Capabilities |
|----------|---------|--------------|
| CLI | `hermes-cli` | Full access |
| Telegram | `hermes-telegram` | Full tools including terminal |
| Discord | `hermes-discord` | Full tools including terminal |
| WhatsApp | `hermes-whatsapp` | Full tools including terminal |
| Slack | `hermes-slack` | Full tools including terminal |
| Signal | `hermes-signal` | Full tools including terminal |
| SMS | `hermes-sms` | Full tools including terminal |
| Email | `hermes-email` | Full tools including terminal |
| Home Assistant | `hermes-homeassistant` | Full tools + HA device control (ha_list_entities, ha_get_state, ha_call_service, ha_list_services) |
| Mattermost | `hermes-mattermost` | Full tools including terminal |
| Matrix | `hermes-matrix` | Full tools including terminal |
| DingTalk | `hermes-dingtalk` | Full tools including terminal |
| API Server | `hermes` (default) | Full tools including terminal |
| Webhooks | `hermes-webhook` | Full tools including terminal |
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Telegram Setup](telegram.md)
|
||||
- [Discord Setup](discord.md)
|
||||
- [Slack Setup](slack.md)
|
||||
- [WhatsApp Setup](whatsapp.md)
|
||||
- [Signal Setup](signal.md)
|
||||
- [SMS Setup (Twilio)](sms.md)
|
||||
- [Email Setup](email.md)
|
||||
- [Home Assistant Integration](homeassistant.md)
|
||||
- [Mattermost Setup](mattermost.md)
|
||||
- [Matrix Setup](matrix.md)
|
||||
- [DingTalk Setup](dingtalk.md)
|
||||
- [Open WebUI + API Server](open-webui.md)
|
||||
- [Webhooks](webhooks.md)
|
||||
354
hermes_code/website/docs/user-guide/messaging/matrix.md
Normal file
|
|
@ -0,0 +1,354 @@
|
|||
---
|
||||
sidebar_position: 9
|
||||
title: "Matrix"
|
||||
description: "Set up Hermes Agent as a Matrix bot"
|
||||
---
|
||||
|
||||
# Matrix Setup
|
||||
|
||||
Hermes Agent integrates with Matrix, the open, federated messaging protocol. Matrix lets you run your own homeserver or use a public one like matrix.org — either way, you keep control of your communications. The bot connects via the `matrix-nio` Python SDK, processes messages through the Hermes Agent pipeline (including tool use, memory, and reasoning), and responds in real time. It supports text, file attachments, images, audio, video, and optional end-to-end encryption (E2EE).
|
||||
|
||||
Hermes works with any Matrix homeserver — Synapse, Conduit, Dendrite, or matrix.org.
|
||||
|
||||
Before setup, here's the part most people want to know: how Hermes behaves once it's connected.
|
||||
|
||||
## How Hermes Behaves
|
||||
|
||||
| Context | Behavior |
|---------|----------|
| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
| **Rooms** | Hermes responds to all messages in rooms it has joined. Room invites are auto-accepted. |
| **Threads** | Hermes supports Matrix threads (MSC3440). If you reply in a thread, Hermes keeps the thread context isolated from the main room timeline. |
| **Shared rooms with multiple users** | By default, Hermes isolates session history per user inside the room. Two people talking in the same room do not share one transcript unless you explicitly disable that. |
|
||||
|
||||
:::tip
|
||||
The bot automatically joins rooms when invited. Just invite the bot's Matrix user to any room and it will join and start responding.
|
||||
:::
|
||||
|
||||
### Session Model in Matrix
|
||||
|
||||
By default:
|
||||
|
||||
- each DM gets its own session
|
||||
- each thread gets its own session namespace
|
||||
- each user in a shared room gets their own session inside that room
|
||||
|
||||
This is controlled by `config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
Set it to `false` only if you explicitly want one shared conversation for the entire room:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: false
|
||||
```
|
||||
|
||||
Shared sessions can be useful for a collaborative room, but they also mean:
|
||||
|
||||
- users share context growth and token costs
|
||||
- one person's long tool-heavy task can bloat everyone else's context
|
||||
- one person's in-flight run can interrupt another person's follow-up in the same room
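One way to picture the session model is as a key that decides which messages share a transcript. The sketch below is purely illustrative: the function `session_key`, its parameters, and the key format are hypothetical and do not describe how Hermes stores sessions.

```python
def session_key(room_id, sender, thread_id=None, is_dm=False, per_user=True):
    """Build a hypothetical key identifying which conversation a message belongs to."""
    parts = ["matrix", room_id]
    if thread_id:
        parts.append(f"thread:{thread_id}")   # each thread gets its own namespace
    if not is_dm and per_user:
        parts.append(f"user:{sender}")        # group_sessions_per_user: true
    return "|".join(parts)


room = "!abc123:matrix.org"
alice = session_key(room, "@alice:matrix.org")
bob = session_key(room, "@bob:matrix.org")
shared_alice = session_key(room, "@alice:matrix.org", per_user=False)
shared_bob = session_key(room, "@bob:matrix.org", per_user=False)
```

With `per_user=True`, Alice and Bob get different keys in the same room. With `per_user=False` the keys are identical, which is why context growth and interruptions are shared.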
|
||||
|
||||
This guide walks you through the full setup process — from creating your bot account to sending your first message.
|
||||
|
||||
## Step 1: Create a Bot Account
|
||||
|
||||
You need a Matrix user account for the bot. There are several ways to do this:
|
||||
|
||||
### Option A: Register on Your Homeserver (Recommended)
|
||||
|
||||
If you run your own homeserver (Synapse, Conduit, Dendrite):
|
||||
|
||||
1. Use the admin API or registration tool to create a new user:
|
||||
|
||||
```bash
|
||||
# Synapse example
|
||||
register_new_matrix_user -c /etc/synapse/homeserver.yaml http://localhost:8008
|
||||
```
|
||||
|
||||
2. Choose a username like `hermes` — the full user ID will be `@hermes:your-server.org`.
|
||||
|
||||
### Option B: Use matrix.org or Another Public Homeserver
|
||||
|
||||
1. Go to [Element Web](https://app.element.io) and create a new account.
|
||||
2. Pick a username for your bot (e.g., `hermes-bot`).
|
||||
|
||||
### Option C: Use Your Own Account
|
||||
|
||||
You can also run Hermes as your own user. This means the bot posts as you — useful for personal assistants.
|
||||
|
||||
## Step 2: Get an Access Token
|
||||
|
||||
Hermes needs credentials to authenticate with the homeserver. You have two options: supply an access token directly, or let Hermes log in with the bot's password.
|
||||
|
||||
### Option A: Access Token (Recommended)
|
||||
|
||||
The most reliable way to get a token:
|
||||
|
||||
**Via Element:**
|
||||
1. Log in to [Element](https://app.element.io) with the bot account.
|
||||
2. Go to **Settings** → **Help & About**.
|
||||
3. Scroll down and expand **Advanced** — the access token is displayed there.
|
||||
4. **Copy it immediately.**
|
||||
|
||||
**Via the API:**
|
||||
|
||||
```bash
|
||||
curl -X POST https://your-server/_matrix/client/v3/login \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"type": "m.login.password",
|
||||
"user": "@hermes:your-server.org",
|
||||
"password": "your-password"
|
||||
}'
|
||||
```
|
||||
|
||||
The response includes an `access_token` field — copy it.
|
||||
|
||||
:::warning[Keep your access token safe]
|
||||
The access token gives full access to the bot's Matrix account. Never share it publicly or commit it to Git. If compromised, revoke it by logging out all sessions for that user.
|
||||
:::
|
||||
|
||||
### Option B: Password Login
|
||||
|
||||
Instead of providing an access token, you can give Hermes the bot's user ID and password. Hermes will log in automatically on startup. This is simpler but means the password is stored in your `.env` file.
|
||||
|
||||
```bash
|
||||
MATRIX_USER_ID=@hermes:your-server.org
|
||||
MATRIX_PASSWORD=your-password
|
||||
```
|
||||
|
||||
## Step 3: Find Your Matrix User ID
|
||||
|
||||
Hermes Agent uses your Matrix User ID to control who can interact with the bot. Matrix User IDs follow the format `@username:server`.
|
||||
|
||||
To find yours:
|
||||
|
||||
1. Open [Element](https://app.element.io) (or your preferred Matrix client).
|
||||
2. Click your avatar → **Settings**.
|
||||
3. Your User ID is displayed at the top of the profile (e.g., `@alice:matrix.org`).
|
||||
|
||||
:::tip
|
||||
Matrix User IDs always start with `@` and contain a `:` followed by the server name. For example: `@alice:matrix.org`, `@bob:your-server.com`.
|
||||
:::
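If you want to sanity-check IDs before adding them to `MATRIX_ALLOWED_USERS`, the `@username:server` shape can be verified with a few lines of Python. This is a loose structural check with a hypothetical function name (`parse_matrix_user_id`). It does not enforce the full character rules of the Matrix specification.

```python
def parse_matrix_user_id(user_id: str) -> tuple[str, str]:
    """Split '@localpart:server' into (localpart, server); raise ValueError otherwise."""
    if not user_id.startswith("@"):
        raise ValueError(f"Matrix user IDs must start with '@': {user_id!r}")
    # Split at the first ':' so a server name with a port stays intact.
    localpart, sep, server = user_id[1:].partition(":")
    if not sep or not localpart or not server:
        raise ValueError(f"expected '@username:server', got {user_id!r}")
    return localpart, server
```

For example, `parse_matrix_user_id("@alice:matrix.org")` returns `("alice", "matrix.org")`.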
|
||||
|
||||
## Step 4: Configure Hermes Agent
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
Run the guided setup command:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Matrix** when prompted, then provide your homeserver URL, access token (or user ID + password), and allowed user IDs when asked.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
**Using an access token:**
|
||||
|
||||
```bash
|
||||
# Required
|
||||
MATRIX_HOMESERVER=https://matrix.example.org
|
||||
MATRIX_ACCESS_TOKEN=***
|
||||
|
||||
# Optional: user ID (auto-detected from token if omitted)
|
||||
# MATRIX_USER_ID=@hermes:matrix.example.org
|
||||
|
||||
# Security: restrict who can interact with the bot
|
||||
MATRIX_ALLOWED_USERS=@alice:matrix.example.org
|
||||
|
||||
# Multiple allowed users (comma-separated)
|
||||
# MATRIX_ALLOWED_USERS=@alice:matrix.example.org,@bob:matrix.example.org
|
||||
```
|
||||
|
||||
**Using password login:**
|
||||
|
||||
```bash
|
||||
# Required
|
||||
MATRIX_HOMESERVER=https://matrix.example.org
|
||||
MATRIX_USER_ID=@hermes:matrix.example.org
|
||||
MATRIX_PASSWORD=***
|
||||
|
||||
# Security
|
||||
MATRIX_ALLOWED_USERS=@alice:matrix.example.org
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
- `group_sessions_per_user: true` keeps each participant's context isolated inside shared rooms.
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
Once configured, start the Matrix gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
The bot should connect to your homeserver and start syncing within a few seconds. Send it a message — either a DM or in a room it has joined — to test.
|
||||
|
||||
:::tip
|
||||
You can run `hermes gateway` in the background or as a systemd service for persistent operation. See the deployment docs for details.
|
||||
:::
|
||||
|
||||
## End-to-End Encryption (E2EE)
|
||||
|
||||
Hermes supports Matrix end-to-end encryption, so you can chat with your bot in encrypted rooms.
|
||||
|
||||
### Requirements
|
||||
|
||||
E2EE requires the `matrix-nio` library with encryption extras and the `libolm` C library:
|
||||
|
||||
```bash
|
||||
# Install matrix-nio with E2EE support
|
||||
pip install 'matrix-nio[e2e]'
|
||||
|
||||
# Or install with hermes extras
|
||||
pip install 'hermes-agent[matrix]'
|
||||
```
|
||||
|
||||
You also need `libolm` installed on your system:
|
||||
|
||||
```bash
|
||||
# Debian/Ubuntu
|
||||
sudo apt install libolm-dev
|
||||
|
||||
# macOS
|
||||
brew install libolm
|
||||
|
||||
# Fedora
|
||||
sudo dnf install libolm-devel
|
||||
```
|
||||
|
||||
### Enable E2EE
|
||||
|
||||
Add to your `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
MATRIX_ENCRYPTION=true
|
||||
```
|
||||
|
||||
When E2EE is enabled, Hermes:
|
||||
|
||||
- Stores encryption keys in `~/.hermes/matrix/store/`
|
||||
- Uploads device keys on first connection
|
||||
- Decrypts incoming messages and encrypts outgoing messages automatically
|
||||
- Auto-joins encrypted rooms when invited
|
||||
|
||||
:::warning
|
||||
If you delete the `~/.hermes/matrix/store/` directory, the bot loses its encryption keys. You'll need to verify the device again in your Matrix client. Back up this directory if you want to preserve encrypted sessions.
|
||||
:::
|
||||
|
||||
:::info
|
||||
If `matrix-nio[e2e]` is not installed or `libolm` is missing, the bot falls back to a plain (unencrypted) client automatically. You'll see a warning in the logs.
|
||||
:::
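
To see in advance whether the fallback described above is likely to trigger, you can check that the Python modules are importable. The sketch below assumes `matrix-nio` imports as `nio` and the Olm bindings import as `olm`; `missing_modules` is a hypothetical helper, and Hermes' own detection logic may differ.

```python
import importlib.util


def missing_modules(names=("nio", "olm")) -> list:
    """Return the subset of `names` that cannot be imported in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("E2EE prerequisites missing:", ", ".join(missing))
    else:
        print("matrix-nio and olm bindings are importable")
```

Run it with the same Python interpreter (or virtual environment) that runs `hermes gateway`, otherwise the result can be misleading.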
|
||||
|
||||
## Home Room
|
||||
|
||||
You can designate a "home room" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:
|
||||
|
||||
### Using the Slash Command
|
||||
|
||||
Type `/sethome` in any Matrix room where the bot is present. That room becomes the home room.
|
||||
|
||||
### Manual Configuration
|
||||
|
||||
Add this to your `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
MATRIX_HOME_ROOM=!abc123def456:matrix.example.org
|
||||
```
|
||||
|
||||
:::tip
|
||||
To find a Room ID: in Element, go to the room → **Settings** → **Advanced** → the **Internal room ID** is shown there (starts with `!`).
|
||||
:::
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Bot is not responding to messages
|
||||
|
||||
**Cause**: The bot hasn't joined the room, or `MATRIX_ALLOWED_USERS` doesn't include your User ID.
|
||||
|
||||
**Fix**: Invite the bot to the room — it auto-joins on invite. Verify your User ID is in `MATRIX_ALLOWED_USERS` (use the full `@user:server` format). Restart the gateway.
|
||||
|
||||
### "Failed to authenticate" / "whoami failed" on startup
|
||||
|
||||
**Cause**: The access token or homeserver URL is incorrect.
|
||||
|
||||
**Fix**: Verify `MATRIX_HOMESERVER` points to your homeserver (include `https://`, no trailing slash). Check that `MATRIX_ACCESS_TOKEN` is valid — try it with curl:
|
||||
|
||||
```bash
|
||||
curl -H "Authorization: Bearer YOUR_TOKEN" \
|
||||
https://your-server/_matrix/client/v3/account/whoami
|
||||
```
|
||||
|
||||
If this returns your user info, the token is valid. If it returns an error, generate a new token.
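
The same check can be scripted. The following is a minimal sketch using only the Python standard library; `build_whoami_request` is a hypothetical helper name, and the endpoint path is the one from the curl command above.

```python
import urllib.request


def build_whoami_request(homeserver: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the whoami request for a homeserver and token."""
    base = homeserver.rstrip("/")  # tolerate an accidental trailing slash
    return urllib.request.Request(
        f"{base}/_matrix/client/v3/account/whoami",
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

Passing the result to `urllib.request.urlopen()` sends it; an HTTP error at that point usually means the token or the homeserver URL is wrong.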
|
||||
|
||||
### "matrix-nio not installed" error
|
||||
|
||||
**Cause**: The `matrix-nio` Python package is not installed.
|
||||
|
||||
**Fix**: Install it:
|
||||
|
||||
```bash
|
||||
pip install 'matrix-nio[e2e]'
|
||||
```
|
||||
|
||||
Or with Hermes extras:
|
||||
|
||||
```bash
|
||||
pip install 'hermes-agent[matrix]'
|
||||
```
|
||||
|
||||
### Encryption errors / "could not decrypt event"
|
||||
|
||||
**Cause**: Missing encryption keys, `libolm` not installed, or the bot's device isn't trusted.
|
||||
|
||||
**Fix**:
|
||||
1. Verify `libolm` is installed on your system (see the E2EE section above).
|
||||
2. Make sure `MATRIX_ENCRYPTION=true` is set in your `.env`.
|
||||
3. In your Matrix client (Element), go to the bot's profile → **Sessions** → verify/trust the bot's device.
|
||||
4. If the bot just joined an encrypted room, it can only decrypt messages sent *after* it joined. Older messages are inaccessible.
|
||||
|
||||
### Sync issues / bot falls behind
|
||||
|
||||
**Cause**: Long-running tool executions can delay the sync loop, or the homeserver is slow.
|
||||
|
||||
**Fix**: The sync loop automatically retries every 5 seconds on error. Check the Hermes logs for sync-related warnings. If the bot consistently falls behind, ensure your homeserver has adequate resources.
|
||||
|
||||
### Bot is offline
|
||||
|
||||
**Cause**: The Hermes gateway isn't running, or it failed to connect.
|
||||
|
||||
**Fix**: Check that `hermes gateway` is running. Look at the terminal output for error messages. Common issues: wrong homeserver URL, expired access token, homeserver unreachable.
|
||||
|
||||
### "User not allowed" / Bot ignores you
|
||||
|
||||
**Cause**: Your User ID isn't in `MATRIX_ALLOWED_USERS`.
|
||||
|
||||
**Fix**: Add your User ID to `MATRIX_ALLOWED_USERS` in `~/.hermes/.env` and restart the gateway. Use the full `@user:server` format.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `MATRIX_ALLOWED_USERS` to restrict who can interact with the bot. Without it, the gateway denies all users by default as a safety measure. Only add User IDs of people you trust — authorized users have full access to the agent's capabilities, including tool use and system access.
|
||||
:::
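
The deny-by-default rule can be pictured as follows. This is a sketch of the behavior described above, not the gateway's actual code; `parse_allowed_users` and `is_allowed` are hypothetical names.

```python
from typing import Optional


def parse_allowed_users(raw: Optional[str]) -> frozenset:
    """Turn a comma-separated MATRIX_ALLOWED_USERS value into a set of user IDs."""
    if not raw:
        return frozenset()  # unset or empty: nobody is allowed
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def is_allowed(sender: str, allowed: frozenset) -> bool:
    """An empty allow-list denies everyone, mirroring the default described above."""
    return sender in allowed
```

Matching is exact in this sketch, which is why the full `@user:server` form matters: `alice` and `@alice:matrix.example.org` are different strings.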
|
||||
|
||||
For more information on securing your Hermes Agent deployment, see the [Security Guide](../security.md).
|
||||
|
||||
## Notes
|
||||
|
||||
- **Any homeserver**: Works with Synapse, Conduit, Dendrite, matrix.org, or any spec-compliant Matrix homeserver. No specific homeserver software required.
|
||||
- **Federation**: If you're on a federated homeserver, the bot can communicate with users from other servers — just add their full `@user:server` IDs to `MATRIX_ALLOWED_USERS`.
|
||||
- **Auto-join**: The bot automatically accepts room invites and joins. It starts responding immediately after joining.
|
||||
- **Media support**: Hermes can send and receive images, audio, video, and file attachments. Media is uploaded to your homeserver using the Matrix content repository API.
|
||||
277
hermes_code/website/docs/user-guide/messaging/mattermost.md
Normal file
|
|
@ -0,0 +1,277 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Mattermost"
|
||||
description: "Set up Hermes Agent as a Mattermost bot"
|
||||
---
|
||||
|
||||
# Mattermost Setup
|
||||
|
||||
Hermes Agent integrates with Mattermost as a bot, letting you chat with your AI assistant through direct messages or team channels. Mattermost is a self-hosted, open-source Slack alternative — you run it on your own infrastructure, keeping full control of your data. The bot connects via Mattermost's REST API (v4) and WebSocket for real-time events, processes messages through the Hermes Agent pipeline (including tool use, memory, and reasoning), and responds in real time. It supports text, file attachments, images, and slash commands.
|
||||
|
||||
No external Mattermost library is required — the adapter uses `aiohttp`, which is already a Hermes dependency.
|
||||
|
||||
Before setup, here's the part most people want to know: how Hermes behaves once it's in your Mattermost instance.
|
||||
|
||||
## How Hermes Behaves
|
||||
|
||||
| Context | Behavior |
|
||||
|---------|----------|
|
||||
| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
|
||||
| **Public/private channels** | Hermes responds when you `@mention` it. Without a mention, Hermes ignores the message. |
|
||||
| **Threads** | If `MATTERMOST_REPLY_MODE=thread`, Hermes replies in a thread under your message. Thread context stays isolated from the parent channel. |
|
||||
| **Shared channels with multiple users** | By default, Hermes isolates session history per user inside the channel. Two people talking in the same channel do not share one transcript unless you explicitly disable that. |
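
The DM and channel rows of the table boil down to one decision. Here is a rough sketch of it, assuming Mattermost's channel type code `"D"` for direct messages; `should_respond` is a hypothetical function and the real adapter may detect mentions differently.

```python
import re


def should_respond(channel_type: str, message: str, bot_username: str) -> bool:
    """DMs always get a reply; any other channel needs an @mention of the bot."""
    if channel_type == "D":  # direct-message channel
        return True
    # Match '@hermes' as a whole token, so '@hermesbot' does not count.
    pattern = rf"(?<!\w)@{re.escape(bot_username)}(?!\w)"
    return re.search(pattern, message) is not None
```

For example, `should_respond("O", "hey @hermes, status?", "hermes")` is true, while the same text without the mention is ignored.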
|
||||
|
||||
:::tip
|
||||
If you want Hermes to reply as threaded conversations (nested under your original message), set `MATTERMOST_REPLY_MODE=thread`. The default is `off`, which sends flat messages in the channel.
|
||||
:::
|
||||
|
||||
### Session Model in Mattermost
|
||||
|
||||
By default:
|
||||
|
||||
- each DM gets its own session
|
||||
- each thread gets its own session namespace
|
||||
- each user in a shared channel gets their own session inside that channel
|
||||
|
||||
This is controlled by `config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
Set it to `false` only if you explicitly want one shared conversation for the entire channel:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: false
|
||||
```
|
||||
|
||||
Shared sessions can be useful for a collaborative channel, but they also mean:
|
||||
|
||||
- users share context growth and token costs
|
||||
- one person's long tool-heavy task can bloat everyone else's context
|
||||
- one person's in-flight run can interrupt another person's follow-up in the same channel
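
One way to picture the session model is as a key that decides which messages share a transcript. The sketch below is purely illustrative: the key format and the `session_key` function are made up for this example and are not how Hermes names sessions internally.

```python
from typing import Optional


def session_key(
    channel_id: str,
    user_id: str,
    *,
    is_dm: bool,
    thread_id: Optional[str] = None,
    group_sessions_per_user: bool = True,
) -> str:
    """Return a hypothetical session identifier for an incoming message."""
    if is_dm:
        return f"dm:{channel_id}"  # each DM has its own session
    parts = [f"channel:{channel_id}"]
    if thread_id:
        parts.append(f"thread:{thread_id}")  # threads get their own namespace
    if group_sessions_per_user:
        parts.append(f"user:{user_id}")  # isolate each participant
    return "/".join(parts)
```

With the default setting, two users in the same channel produce different keys. With `group_sessions_per_user=False` they collapse into one shared key, which is where the shared-context trade-offs listed above come from.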
|
||||
|
||||
This guide walks you through the full setup process — from creating your bot on Mattermost to sending your first message.
|
||||
|
||||
## Step 1: Enable Bot Accounts
|
||||
|
||||
Bot accounts must be enabled on your Mattermost server before you can create one.
|
||||
|
||||
1. Log in to Mattermost as a **System Admin**.
|
||||
2. Go to **System Console** → **Integrations** → **Bot Accounts**.
|
||||
3. Set **Enable Bot Account Creation** to **true**.
|
||||
4. Click **Save**.
|
||||
|
||||
:::info
|
||||
If you don't have System Admin access, ask your Mattermost administrator to enable bot accounts and create one for you.
|
||||
:::
|
||||
|
||||
## Step 2: Create a Bot Account
|
||||
|
||||
1. In Mattermost, click the **☰** menu (top-left) → **Integrations** → **Bot Accounts**.
|
||||
2. Click **Add Bot Account**.
|
||||
3. Fill in the details:
|
||||
- **Username**: e.g., `hermes`
|
||||
- **Display Name**: e.g., `Hermes Agent`
|
||||
- **Description**: optional
|
||||
- **Role**: `Member` is sufficient
|
||||
4. Click **Create Bot Account**.
|
||||
5. Mattermost will display the **bot token**. **Copy it immediately.**
|
||||
|
||||
:::warning[Token shown only once]
|
||||
The bot token is only displayed once when you create the bot account. If you lose it, you'll need to regenerate it from the bot account settings. Never share your token publicly or commit it to Git — anyone with this token has full control of the bot.
|
||||
:::
|
||||
|
||||
Store the token somewhere safe (a password manager, for example). You'll need it in Step 5.
|
||||
|
||||
:::tip
|
||||
You can also use a **personal access token** instead of a bot account. Go to **Profile** → **Security** → **Personal Access Tokens** → **Create Token**. This is useful if you want Hermes to post as your own user rather than a separate bot user.
|
||||
:::
|
||||
|
||||
## Step 3: Add the Bot to Channels
|
||||
|
||||
The bot needs to be a member of any channel where you want it to respond:
|
||||
|
||||
1. Open the channel where you want the bot.
|
||||
2. Click the channel name → **Add Members**.
|
||||
3. Search for your bot username (e.g., `hermes`) and add it.
|
||||
|
||||
For DMs, simply open a direct message with the bot — it will be able to respond immediately.
|
||||
|
||||
## Step 4: Find Your Mattermost User ID
|
||||
|
||||
Hermes Agent uses your Mattermost User ID to control who can interact with the bot. To find it:
|
||||
|
||||
1. Click your **avatar** (top-left corner) → **Profile**.
|
||||
2. Your User ID is displayed in the profile dialog — click it to copy.
|
||||
|
||||
Your User ID is a 26-character alphanumeric string like `3uo8dkh1p7g1mfk49ear5fzs5c`.
|
||||
|
||||
:::warning
|
||||
Your User ID is **not** your username. The username is what appears after `@` (e.g., `@alice`). The User ID is a long alphanumeric identifier that Mattermost uses internally.
|
||||
:::
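
A quick way to catch the username/User ID mix-up before editing your `.env` is a shape check. This sketch assumes IDs use lowercase letters and digits, as in the example above; `looks_like_mattermost_id` is a hypothetical helper.

```python
import re

# 26 characters, lowercase letters and digits (assumption based on the example ID).
_MATTERMOST_ID = re.compile(r"[a-z0-9]{26}")


def looks_like_mattermost_id(value: str) -> bool:
    """Return True if `value` has the shape of a Mattermost User ID."""
    return _MATTERMOST_ID.fullmatch(value) is not None
```

`looks_like_mattermost_id("@alice")` is false, so a username pasted by mistake is easy to spot.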
|
||||
|
||||
**Alternative**: You can also get your User ID via the API:
|
||||
|
||||
```bash
|
||||
curl -H "Authorization: Bearer YOUR_TOKEN" \
|
||||
https://your-mattermost-server/api/v4/users/me | jq .id
|
||||
```
|
||||
|
||||
:::tip
|
||||
To get a **Channel ID**: click the channel name → **View Info**. The Channel ID is shown in the info panel. You'll need this if you want to set a home channel manually.
|
||||
:::
|
||||
|
||||
## Step 5: Configure Hermes Agent
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
Run the guided setup command:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Mattermost** when prompted, then paste your server URL, bot token, and user ID when asked.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
MATTERMOST_URL=https://mm.example.com
|
||||
MATTERMOST_TOKEN=***
|
||||
MATTERMOST_ALLOWED_USERS=3uo8dkh1p7g1mfk49ear5fzs5c
|
||||
|
||||
# Multiple allowed users (comma-separated)
|
||||
# MATTERMOST_ALLOWED_USERS=3uo8dkh1p7g1mfk49ear5fzs5c,8fk2jd9s0a7bncm1xqw4tp6r3e
|
||||
|
||||
# Optional: reply mode (thread or off, default: off)
|
||||
# MATTERMOST_REPLY_MODE=thread
|
||||
```
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: true
|
||||
```
|
||||
|
||||
- `group_sessions_per_user: true` keeps each participant's context isolated inside shared channels and threads.
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
Once configured, start the Mattermost gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
The bot should connect to your Mattermost server within a few seconds. Send it a message — either a DM or in a channel where it's been added — to test.
|
||||
|
||||
:::tip
|
||||
You can run `hermes gateway` in the background or as a systemd service for persistent operation. See the deployment docs for details.
|
||||
:::
|
||||
|
||||
## Home Channel
|
||||
|
||||
You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:
|
||||
|
||||
### Using the Slash Command
|
||||
|
||||
Type `/sethome` in any Mattermost channel where the bot is present. That channel becomes the home channel.
|
||||
|
||||
### Manual Configuration
|
||||
|
||||
Add this to your `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
MATTERMOST_HOME_CHANNEL=abc123def456ghi789jkl012mn
|
||||
```
|
||||
|
||||
Replace the ID with the actual channel ID (click the channel name → View Info → copy the ID).
|
||||
|
||||
## Reply Mode
|
||||
|
||||
The `MATTERMOST_REPLY_MODE` setting controls how Hermes posts responses:
|
||||
|
||||
| Mode | Behavior |
|
||||
|------|----------|
|
||||
| `off` (default) | Hermes posts flat messages in the channel, like a normal user. |
|
||||
| `thread` | Hermes replies in a thread under your original message. Keeps channels clean when there's lots of back-and-forth. |
|
||||
|
||||
Set it in your `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
MATTERMOST_REPLY_MODE=thread
|
||||
```
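
In API terms, the difference between the two modes comes down to whether the post carries a `root_id`. The sketch below builds the body for Mattermost's create-post endpoint (`POST /api/v4/posts`); `build_post` and its parameters are hypothetical names, and the real adapter may structure this differently.

```python
from typing import Optional


def build_post(
    channel_id: str,
    message: str,
    reply_mode: str = "off",
    parent_post_id: Optional[str] = None,
    parent_root_id: Optional[str] = None,
) -> dict:
    """Build a post body; in thread mode, attach it to the triggering message."""
    post = {"channel_id": channel_id, "message": message}
    if reply_mode == "thread" and parent_post_id:
        # If the triggering message is already inside a thread, reply to that
        # thread's root; otherwise the triggering message becomes the root.
        post["root_id"] = parent_root_id or parent_post_id
    return post
```

With `reply_mode="off"` the body has no `root_id`, so the reply lands as a flat message in the channel.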
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Bot is not responding to messages
|
||||
|
||||
**Cause**: The bot is not a member of the channel, or `MATTERMOST_ALLOWED_USERS` doesn't include your User ID.
|
||||
|
||||
**Fix**: Add the bot to the channel (channel name → Add Members → search for the bot). Verify your User ID is in `MATTERMOST_ALLOWED_USERS`. Restart the gateway.
|
||||
|
||||
### 403 Forbidden errors
|
||||
|
||||
**Cause**: The bot token is invalid, or the bot doesn't have permission to post in the channel.
|
||||
|
||||
**Fix**: Check that `MATTERMOST_TOKEN` in your `.env` file is correct. Make sure the bot account hasn't been deactivated. Verify the bot has been added to the channel. If using a personal access token, ensure your account has the required permissions.
|
||||
|
||||
### WebSocket disconnects / reconnection loops
|
||||
|
||||
**Cause**: Network instability, Mattermost server restarts, or firewall/proxy issues with WebSocket connections.
|
||||
|
||||
**Fix**: The adapter automatically reconnects with exponential backoff (2s → 60s). Check your server's WebSocket configuration — reverse proxies (nginx, Apache) need WebSocket upgrade headers configured. Verify no firewall is blocking WebSocket connections on your Mattermost server.
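
To get a feel for how long the adapter may wait between reconnect attempts, here is a sketch of an exponential backoff schedule running from 2 seconds up to a 60-second cap. The doubling factor is an assumption (the guide only states the 2s and 60s bounds), and `backoff_delays` is a hypothetical name.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(initial: float = 2.0, maximum: float = 60.0, factor: float = 2.0) -> Iterator[float]:
    """Yield reconnect delays: initial, initial*factor, ... capped at maximum."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


if __name__ == "__main__":
    # First seven delays under the assumed doubling factor.
    print(list(islice(backoff_delays(), 7)))
```

Under these assumptions the cap is reached after five attempts, so a persistent reconnection loop settles at roughly one attempt per minute.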
|
||||
|
||||
For nginx, ensure your config includes:
|
||||
|
||||
```nginx
|
||||
location /api/v4/websocket {
|
||||
proxy_pass http://mattermost-backend;
proxy_http_version 1.1;  # WebSocket upgrade requires HTTP/1.1 to the backend
|
||||
proxy_set_header Upgrade $http_upgrade;
|
||||
proxy_set_header Connection "upgrade";
|
||||
proxy_read_timeout 600s;
|
||||
}
|
||||
```
|
||||
|
||||
### "Failed to authenticate" on startup
|
||||
|
||||
**Cause**: The token or server URL is incorrect.
|
||||
|
||||
**Fix**: Verify `MATTERMOST_URL` points to your Mattermost server (include `https://`, no trailing slash). Check that `MATTERMOST_TOKEN` is valid — try it with curl:
|
||||
|
||||
```bash
|
||||
curl -H "Authorization: Bearer YOUR_TOKEN" \
|
||||
https://your-server/api/v4/users/me
|
||||
```
|
||||
|
||||
If this returns your bot's user info, the token is valid. If it returns an error, regenerate the token.
|
||||
|
||||
### Bot is offline
|
||||
|
||||
**Cause**: The Hermes gateway isn't running, or it failed to connect.
|
||||
|
||||
**Fix**: Check that `hermes gateway` is running. Look at the terminal output for error messages. Common issues: wrong URL, expired token, Mattermost server unreachable.
|
||||
|
||||
### "User not allowed" / Bot ignores you
|
||||
|
||||
**Cause**: Your User ID isn't in `MATTERMOST_ALLOWED_USERS`.
|
||||
|
||||
**Fix**: Add your User ID to `MATTERMOST_ALLOWED_USERS` in `~/.hermes/.env` and restart the gateway. Remember: the User ID is a 26-character alphanumeric string, not your `@username`.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `MATTERMOST_ALLOWED_USERS` to restrict who can interact with the bot. Without it, the gateway denies all users by default as a safety measure. Only add User IDs of people you trust — authorized users have full access to the agent's capabilities, including tool use and system access.
|
||||
:::
|
||||
|
||||
For more information on securing your Hermes Agent deployment, see the [Security Guide](../security.md).
|
||||
|
||||
## Notes
|
||||
|
||||
- **Self-hosted friendly**: Works with any self-hosted Mattermost instance. No Mattermost Cloud account or subscription required.
|
||||
- **No extra dependencies**: The adapter uses `aiohttp` for HTTP and WebSocket, which is already included with Hermes Agent.
|
||||
- **Team Edition compatible**: Works with both Mattermost Team Edition (free) and Enterprise Edition.
|
||||
208
hermes_code/website/docs/user-guide/messaging/open-webui.md
Normal file
|
|
@ -0,0 +1,208 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Open WebUI"
|
||||
description: "Connect Open WebUI to Hermes Agent via the OpenAI-compatible API server"
|
||||
---
|
||||
|
||||
# Open WebUI Integration
|
||||
|
||||
[Open WebUI](https://github.com/open-webui/open-webui) (126k★) is one of the most popular self-hosted chat interfaces for AI. With Hermes Agent's built-in API server, you can use Open WebUI as a polished web frontend for your agent, complete with conversation management, user accounts, and a modern chat interface.
|
||||
|
||||
## Architecture
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A["Open WebUI<br/>browser UI<br/>port 3000"]
|
||||
B["hermes-agent<br/>gateway API server<br/>port 8642"]
|
||||
A -->|POST /v1/chat/completions| B
|
||||
B -->|SSE streaming response| A
|
||||
```
|
||||
|
||||
Open WebUI connects to Hermes Agent's API server just like it would connect to OpenAI. Your agent handles the requests with its full toolset — terminal, file operations, web search, memory, skills — and returns the final response.
|
||||
|
||||
Open WebUI talks to Hermes server-to-server, so you do not need `API_SERVER_CORS_ORIGINS` for this integration.
|
||||
|
||||
## Quick Setup
|
||||
|
||||
### 1. Enable the API server
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
API_SERVER_ENABLED=true
|
||||
API_SERVER_KEY=your-secret-key
|
||||
```
|
||||
|
||||
### 2. Start Hermes Agent gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
You should see:
|
||||
|
||||
```
|
||||
[API Server] API server listening on http://127.0.0.1:8642
|
||||
```
|
||||
|
||||
### 3. Start Open WebUI
|
||||
|
||||
```bash
|
||||
docker run -d -p 3000:8080 \
|
||||
-e OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1 \
|
||||
-e OPENAI_API_KEY=your-secret-key \
|
||||
--add-host=host.docker.internal:host-gateway \
|
||||
-v open-webui:/app/backend/data \
|
||||
--name open-webui \
|
||||
--restart always \
|
||||
ghcr.io/open-webui/open-webui:main
|
||||
```
|
||||
|
||||
### 4. Open the UI
|
||||
|
||||
Go to **http://localhost:3000**. Create your admin account (the first user becomes admin). You should see **hermes-agent** in the model dropdown. Start chatting!
|
||||
|
||||
## Docker Compose Setup
|
||||
|
||||
For a more permanent setup, create a `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
open-webui:
|
||||
image: ghcr.io/open-webui/open-webui:main
|
||||
ports:
|
||||
- "3000:8080"
|
||||
volumes:
|
||||
- open-webui:/app/backend/data
|
||||
environment:
|
||||
- OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1
|
||||
- OPENAI_API_KEY=your-secret-key
|
||||
extra_hosts:
|
||||
- "host.docker.internal:host-gateway"
|
||||
restart: always
|
||||
|
||||
volumes:
|
||||
open-webui:
|
||||
```
|
||||
|
||||
Then:
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
## Configuring via the Admin UI
|
||||
|
||||
If you prefer to configure the connection through the UI instead of environment variables:
|
||||
|
||||
1. Log in to Open WebUI at **http://localhost:3000**
|
||||
2. Click your **profile avatar** → **Admin Settings**
|
||||
3. Go to **Connections**
|
||||
4. Under **OpenAI API**, click the **wrench icon** (Manage)
|
||||
5. Click **+ Add New Connection**
|
||||
6. Enter:
|
||||
- **URL**: `http://host.docker.internal:8642/v1`
|
||||
- **API Key**: your `API_SERVER_KEY` value (Open WebUI requires a non-empty key)
|
||||
7. Click the **checkmark** to verify the connection
|
||||
8. **Save**
|
||||
|
||||
The **hermes-agent** model should now appear in the model dropdown.
|
||||
|
||||
:::warning
|
||||
Environment variables only take effect on Open WebUI's **first launch**. After that, connection settings are stored in its internal database. To change them later, use the Admin UI or delete the Docker volume and start fresh.
|
||||
:::
|
||||
|
||||
## API Type: Chat Completions vs Responses
|
||||
|
||||
Open WebUI supports two API modes when connecting to a backend:
|
||||
|
||||
| Mode | Format | When to use |
|
||||
|------|--------|-------------|
|
||||
| **Chat Completions** (default) | `/v1/chat/completions` | Recommended. Works out of the box. |
|
||||
| **Responses** (experimental) | `/v1/responses` | For server-side conversation state via `previous_response_id`. |
|
||||
|
||||
### Using Chat Completions (recommended)
|
||||
|
||||
This is the default and requires no extra configuration. Open WebUI sends standard OpenAI-format requests and Hermes Agent responds accordingly. Each request includes the full conversation history.
|
||||
|
||||
### Using Responses API
|
||||
|
||||
To use the Responses API mode:
|
||||
|
||||
1. Go to **Admin Settings** → **Connections** → **OpenAI** → **Manage**
|
||||
2. Edit your hermes-agent connection
|
||||
3. Change **API Type** from "Chat Completions" to **"Responses (Experimental)"**
|
||||
4. Save
|
||||
|
||||
With the Responses API, Open WebUI sends requests in the Responses format (`input` array + `instructions`), and Hermes Agent can preserve full tool call history across turns via `previous_response_id`.
|
||||
|
||||
:::note
|
||||
Open WebUI currently manages conversation history client-side even in Responses mode — it sends the full message history in each request rather than using `previous_response_id`. The Responses API mode is mainly useful for future compatibility as frontends evolve.
|
||||
:::
|
||||
|
||||
## How It Works
|
||||
|
||||
When you send a message in Open WebUI:
|
||||
|
||||
1. Open WebUI sends a `POST /v1/chat/completions` request with your message and conversation history
|
||||
2. Hermes Agent creates an AIAgent instance with its full toolset
|
||||
3. The agent processes your request — it may call tools (terminal, file operations, web search, etc.)
|
||||
4. Tool calls happen invisibly server-side
|
||||
5. The agent's final text response is returned to Open WebUI
|
||||
6. Open WebUI displays the response in its chat interface
|
||||
|
||||
Your agent has access to all the same tools and capabilities as when using the CLI or Telegram — the only difference is the frontend.
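
If you want to reproduce step 1 without Open WebUI, the request can be built by hand. This is a minimal sketch using the standard library; `build_chat_request` is a hypothetical helper, and the `hermes-agent` model name is the one shown in the model dropdown above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, messages: list, model: str = "hermes-agent") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",  # base_url should end in /v1
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

For example, `build_chat_request("http://localhost:8642/v1", "your-secret-key", [{"role": "user", "content": "hello"}])` yields a request you can send with `urllib.request.urlopen()`. Expect it to take as long as the agent needs to finish its tool calls.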
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
### Hermes Agent (API server)
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `API_SERVER_ENABLED` | `false` | Enable the API server |
|
||||
| `API_SERVER_PORT` | `8642` | HTTP server port |
|
||||
| `API_SERVER_HOST` | `127.0.0.1` | Bind address |
|
||||
| `API_SERVER_KEY` | _(required)_ | Bearer token for auth. Match `OPENAI_API_KEY`. |
|
||||
|
||||
### Open WebUI
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `OPENAI_API_BASE_URL` | Hermes Agent's API URL (include `/v1`) |
|
||||
| `OPENAI_API_KEY` | Must be non-empty. Match your `API_SERVER_KEY`. |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### No models appear in the dropdown
|
||||
|
||||
- **Check the URL has `/v1` suffix**: `http://host.docker.internal:8642/v1` (not just `:8642`)
|
||||
- **Verify the gateway is running**: `curl http://localhost:8642/health` should return `{"status": "ok"}`
|
||||
- **Check model listing**: `curl http://localhost:8642/v1/models` should return a list with `hermes-agent`
|
||||
- **Docker networking**: From inside Docker, `localhost` means the container, not your host. Use `host.docker.internal` or `--network=host`.
|
||||
|
||||
### Connection test passes but no models load
|
||||
|
||||
This is almost always the missing `/v1` suffix. Open WebUI's connection test is a basic connectivity check — it doesn't verify model listing works.
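
If you generate the connection URL from a script, a small guard avoids this mistake. The following is a sketch only; `ensure_v1_suffix` is a hypothetical helper and not something Open WebUI or Hermes provides.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_v1_suffix(url: str) -> str:
    """Return `url` with its path ending in '/v1' and no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    return urlunsplit(parts._replace(path=path))
```

Both `http://host.docker.internal:8642` and `http://host.docker.internal:8642/v1/` normalize to `http://host.docker.internal:8642/v1`.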
|
||||
|
||||
### Response takes a long time
|
||||
|
||||
Hermes Agent may be executing multiple tool calls (reading files, running commands, searching the web) before producing its final response. This is normal for complex queries. The response appears all at once when the agent finishes.
|
||||
|
||||
### "Invalid API key" errors
|
||||
|
||||
Make sure your `OPENAI_API_KEY` in Open WebUI matches the `API_SERVER_KEY` in Hermes Agent.
|
||||
|
||||
## Linux Docker (no Docker Desktop)
|
||||
|
||||
On Linux without Docker Desktop, `host.docker.internal` doesn't resolve by default. Options:
|
||||
|
||||
```bash
|
||||
# Option 1: Add host mapping
|
||||
docker run --add-host=host.docker.internal:host-gateway ...
|
||||
|
||||
# Option 2: Use host networking
|
||||
docker run --network=host -e OPENAI_API_BASE_URL=http://localhost:8642/v1 ...
|
||||
|
||||
# Option 3: Use Docker bridge IP
|
||||
docker run -e OPENAI_API_BASE_URL=http://172.17.0.1:8642/v1 ...
|
||||
```
|
||||
238
hermes_code/website/docs/user-guide/messaging/signal.md
Normal file
|
|
@ -0,0 +1,238 @@
|
|||
---
|
||||
sidebar_position: 6
|
||||
title: "Signal"
|
||||
description: "Set up Hermes Agent as a Signal messenger bot via signal-cli daemon"
|
||||
---
|
||||
|
||||
# Signal Setup
|
||||
|
||||
Hermes connects to Signal through the [signal-cli](https://github.com/AsamK/signal-cli) daemon running in HTTP mode. The adapter streams incoming messages in real time via SSE (Server-Sent Events) and sends responses via JSON-RPC.
|
||||
|
||||
Signal is the most privacy-focused mainstream messenger — end-to-end encrypted by default, open-source protocol, minimal metadata collection. This makes it ideal for security-sensitive agent workflows.
|
||||
|
||||
:::info[No New Python Dependencies]
|
||||
The Signal adapter uses `httpx` (already a core Hermes dependency) for all communication. No additional Python packages are required. You just need signal-cli installed externally.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **signal-cli** — Java-based Signal client ([GitHub](https://github.com/AsamK/signal-cli))
|
||||
- **Java 17+** runtime — required by signal-cli
|
||||
- **A phone number** with Signal installed (for linking as a secondary device)
|
||||
|
||||
### Installing signal-cli
|
||||
|
||||
```bash
|
||||
# Linux (Debian/Ubuntu)
|
||||
sudo apt install signal-cli
|
||||
|
||||
# macOS
|
||||
brew install signal-cli
|
||||
|
||||
# Manual install (any platform)
|
||||
# Download from https://github.com/AsamK/signal-cli/releases
|
||||
# Extract and add to PATH
|
||||
```
|
||||
|
||||
### Alternative: Docker (signal-cli-rest-api)
|
||||
|
||||
If you prefer Docker, use the [signal-cli-rest-api](https://github.com/bbernhard/signal-cli-rest-api) container:
|
||||
|
||||
```bash
|
||||
docker run -d --name signal-cli \
|
||||
-p 8080:8080 \
|
||||
-v $HOME/.local/share/signal-cli:/home/.local/share/signal-cli \
|
||||
-e MODE=json-rpc \
|
||||
bbernhard/signal-cli-rest-api
|
||||
```
|
||||
|
||||
:::tip
|
||||
Use `MODE=json-rpc` for best performance. The `normal` mode spawns a JVM per request and is much slower.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Link Your Signal Account
|
||||
|
||||
signal-cli works as a **linked device**, similar to WhatsApp Web but for Signal. Your phone stays the primary device.
|
||||
|
||||
```bash
|
||||
# Generate a linking URI (displays a QR code or link)
|
||||
signal-cli link -n "HermesAgent"
|
||||
```
|
||||
|
||||
1. Open **Signal** on your phone
|
||||
2. Go to **Settings → Linked Devices**
|
||||
3. Tap **Link New Device**
|
||||
4. Scan the QR code or enter the URI
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Start the signal-cli Daemon
|
||||
|
||||
```bash
|
||||
# Replace +1234567890 with your Signal phone number (E.164 format)
|
||||
signal-cli --account +1234567890 daemon --http 127.0.0.1:8080
|
||||
```
|
||||
|
||||
:::tip
|
||||
Keep this running in the background. You can use `systemd`, `tmux`, `screen`, or run it as a service.
|
||||
:::
|
||||
|
||||
Verify it's running:
|
||||
|
||||
```bash
|
||||
curl http://127.0.0.1:8080/api/v1/check
|
||||
# Should return: {"versions":{"signal-cli":...}}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Configure Hermes
|
||||
|
||||
The easiest way:
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Signal** from the platform menu. The wizard will:
|
||||
|
||||
1. Check if signal-cli is installed
|
||||
2. Prompt for the HTTP URL (default: `http://127.0.0.1:8080`)
|
||||
3. Test connectivity to the daemon
|
||||
4. Ask for your account phone number
|
||||
5. Configure allowed users and access policies
|
||||
|
||||
### Manual Configuration
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
SIGNAL_HTTP_URL=http://127.0.0.1:8080
|
||||
SIGNAL_ACCOUNT=+1234567890
|
||||
|
||||
# Security (recommended)
|
||||
SIGNAL_ALLOWED_USERS=+1234567890,+1987654321 # Comma-separated E.164 numbers (no leading 0 after the +) or UUIDs
|
||||
|
||||
# Optional
|
||||
SIGNAL_GROUP_ALLOWED_USERS=groupId1,groupId2 # Enable groups (omit to disable, * for all)
|
||||
SIGNAL_HOME_CHANNEL=+1234567890 # Default delivery target for cron jobs
|
||||
```
|
||||
|
||||
Then start the gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway # Foreground
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux only: boot-time system service
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Access Control
|
||||
|
||||
### DM Access
|
||||
|
||||
DM access follows the same pattern as all other Hermes platforms:
|
||||
|
||||
1. **`SIGNAL_ALLOWED_USERS` set** → only those users can message
|
||||
2. **No allowlist set** → unknown users get a DM pairing code (approve via `hermes pairing approve signal CODE`)
|
||||
3. **`SIGNAL_ALLOW_ALL_USERS=true`** → anyone can message (use with caution)
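
The three rules above can be read as a small decision function. The following is a minimal sketch, not the adapter's actual code: the name `dm_decision`, its string return values, and the order in which the rules are checked are assumptions made for illustration.

```python
from typing import Iterable


def dm_decision(sender: str, allowed_users: Iterable[str], allow_all: bool = False) -> str:
    """Return "allow", "pair", or "deny" for an incoming DM (hypothetical helper)."""
    allowed = {u.strip() for u in allowed_users if u.strip()}
    if allow_all:
        return "allow"  # SIGNAL_ALLOW_ALL_USERS=true
    if allowed:
        # An allowlist is set: only listed senders get through.
        return "allow" if sender in allowed else "deny"
    return "pair"  # no allowlist: unknown sender receives a pairing code
```

For example, `dm_decision("+15550000000", [])` yields `"pair"`, which corresponds to the pairing-code flow in rule 2.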
|
||||
|
||||
### Group Access
|
||||
|
||||
Group access is controlled by the `SIGNAL_GROUP_ALLOWED_USERS` env var:
|
||||
|
||||
| Configuration | Behavior |
|
||||
|---------------|----------|
|
||||
| Not set (default) | All group messages are ignored. The bot only responds to DMs. |
|
||||
| Set with group IDs | Only listed groups are monitored (e.g., `groupId1,groupId2`). |
|
||||
| Set to `*` | The bot responds in any group it's a member of. |
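
The table above maps onto a short membership check. This sketch assumes the variable is a comma-separated string, as in the manual configuration example; `group_allowed` is a hypothetical helper name, not part of Hermes.

```python
from typing import Optional


def group_allowed(group_id: str, env_value: Optional[str]) -> bool:
    """Mirror the SIGNAL_GROUP_ALLOWED_USERS table (hypothetical helper)."""
    if not env_value or not env_value.strip():
        return False  # not set: all group messages are ignored
    entries = {part.strip() for part in env_value.split(",") if part.strip()}
    if "*" in entries:
        return True  # wildcard: respond in any group
    return group_id in entries  # only listed groups are monitored
```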
|
||||
|
||||
---
|
||||
|
||||
## Features
|
||||
|
||||
### Attachments
|
||||
|
||||
The adapter supports sending and receiving:
|
||||
|
||||
- **Images** — PNG, JPEG, GIF, WebP (auto-detected via magic bytes)
|
||||
- **Audio** — MP3, OGG, WAV, M4A (voice messages transcribed if Whisper is configured)
|
||||
- **Documents** — PDF, ZIP, and other file types
|
||||
|
||||
Attachment size limit: **100 MB**.
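
Magic-byte detection, as mentioned for images above, means inspecting the first bytes of a file rather than trusting its name. A minimal sketch using the well-known signatures of the four listed formats follows; the function name `sniff_image_type` is an assumption, and the adapter's real detection code may differ.

```python
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess an image MIME type from its leading bytes; None if unrecognized."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", four size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```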
|
||||
|
||||
### Typing Indicators
|
||||
|
||||
The bot sends typing indicators while processing messages, refreshing every 8 seconds.
|
||||
|
||||
### Phone Number Redaction
|
||||
|
||||
All phone numbers are automatically redacted in logs:
|
||||
- `+15551234567` → `+155****4567`
|
||||
- This applies to both Hermes gateway logs and the global redaction system
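
To illustrate the redaction format shown above, here is a minimal sketch that keeps the first four and last four characters of each number. The regular expression and the name `redact_phone_numbers` are assumptions; Hermes's own redaction system may match numbers differently.

```python
import re

# Assumption: a "+" followed by 8 to 15 digits is treated as a phone number.
_E164 = re.compile(r"\+\d{8,15}")


def redact_phone_numbers(text: str) -> str:
    """Mask the middle of every E.164-looking number in text."""
    def _mask(match: re.Match) -> str:
        number = match.group(0)
        return f"{number[:4]}****{number[-4:]}"
    return _E164.sub(_mask, text)


print(redact_phone_numbers("message from +15551234567"))  # message from +155****4567
```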
|
||||
|
||||
### Note to Self (Single-Number Setup)
|
||||
|
||||
If you run signal-cli as a **linked secondary device** on your own phone number (rather than a separate bot number), you can interact with Hermes through Signal's "Note to Self" feature.
|
||||
|
||||
Just send a message to yourself from your phone — signal-cli picks it up and Hermes responds in the same conversation.
|
||||
|
||||
**How it works:**
|
||||
- "Note to Self" messages arrive as `syncMessage.sentMessage` envelopes
|
||||
- The adapter detects when these are addressed to the bot's own account and processes them as regular inbound messages
|
||||
- Echo-back protection (sent-timestamp tracking) prevents infinite loops — the bot's own replies are filtered out automatically
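
Sent-timestamp tracking can be pictured as a small bounded set: record the timestamp of every message the bot sends, then skip any sync envelope carrying one of those timestamps. The class below is only a sketch of that idea. Its name, the 256-entry bound, and the one-shot `pop` behavior are all assumptions rather than the adapter's actual implementation.

```python
from collections import OrderedDict


class SentTimestampTracker:
    """Remember timestamps of sent messages so their sync echoes can be skipped."""

    def __init__(self, max_entries: int = 256) -> None:
        self._sent: "OrderedDict[int, None]" = OrderedDict()
        self._max_entries = max_entries

    def record_sent(self, timestamp_ms: int) -> None:
        self._sent[timestamp_ms] = None
        while len(self._sent) > self._max_entries:
            self._sent.popitem(last=False)  # drop the oldest entry

    def is_echo(self, timestamp_ms: int) -> bool:
        # pop() so that each recorded timestamp suppresses at most one echo.
        return self._sent.pop(timestamp_ms, "missing") is None
```

Bounding the set keeps memory use constant on a long-running gateway, at the cost of forgetting very old timestamps.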
|
||||
|
||||
**No extra configuration needed.** This works automatically as long as `SIGNAL_ACCOUNT` matches your phone number.
|
||||
|
||||
### Health Monitoring
|
||||
|
||||
The adapter monitors the SSE connection and automatically reconnects if:
|
||||
- The connection drops (with exponential backoff: 2s → 60s)
|
||||
- No activity is detected for 120 seconds (pings signal-cli to verify)
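
The reconnect schedule can be sketched as a capped exponential sequence. The 2-second start and 60-second cap come from the list above; the doubling factor and the name `reconnect_delays` are assumptions.

```python
import itertools
from typing import Iterator


def reconnect_delays(initial: float = 2.0, maximum: float = 60.0, factor: float = 2.0) -> Iterator[float]:
    """Yield reconnect delays in seconds, growing by `factor` and capped at `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


print(list(itertools.islice(reconnect_delays(), 7)))
# [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```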
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| **"Cannot reach signal-cli"** during setup | Ensure signal-cli daemon is running: `signal-cli --account +YOUR_NUMBER daemon --http 127.0.0.1:8080` |
|
||||
| **Messages not received** | Check that `SIGNAL_ALLOWED_USERS` includes the sender's number in E.164 format (with `+` prefix) |
|
||||
| **"signal-cli not found on PATH"** | Install signal-cli and ensure it's in your PATH, or use Docker |
|
||||
| **Connection keeps dropping** | Check signal-cli logs for errors. Ensure Java 17+ is installed. |
|
||||
| **Group messages ignored** | Configure `SIGNAL_GROUP_ALLOWED_USERS` with specific group IDs, or `*` to allow all groups. |
|
||||
| **Bot responds to no one** | Configure `SIGNAL_ALLOWED_USERS`, use DM pairing, or set `SIGNAL_ALLOW_ALL_USERS=true` if you want broader access (use with caution). |
|
||||
| **Duplicate messages** | Ensure only one signal-cli instance is listening on your phone number |
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
**Always configure access controls.** The bot has terminal access by default. Without `SIGNAL_ALLOWED_USERS` or DM pairing, the gateway denies all incoming messages as a safety measure.
|
||||
:::
|
||||
|
||||
- Phone numbers are redacted in all log output
|
||||
- Use DM pairing or explicit allowlists for safe onboarding of new users
|
||||
- Keep groups disabled unless you specifically need group support, or allowlist only the groups you trust
|
||||
- Signal's end-to-end encryption protects message content in transit
|
||||
- The signal-cli session data in `~/.local/share/signal-cli/` contains account credentials — protect it like a password
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables Reference
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|----------|----------|---------|-------------|
|
||||
| `SIGNAL_HTTP_URL` | Yes | — | signal-cli HTTP endpoint |
|
||||
| `SIGNAL_ACCOUNT` | Yes | — | Bot phone number (E.164) |
|
||||
| `SIGNAL_ALLOWED_USERS` | No | — | Comma-separated phone numbers/UUIDs |
|
||||
| `SIGNAL_GROUP_ALLOWED_USERS` | No | — | Group IDs to monitor, or `*` for all (omit to disable groups) |
|
||||
| `SIGNAL_ALLOW_ALL_USERS` | No | `false` | Allow any user to interact (skip allowlist) |
|
||||
| `SIGNAL_HOME_CHANNEL` | No | — | Default delivery target for cron jobs |
|
||||
274
hermes_code/website/docs/user-guide/messaging/slack.md
Normal file
|
|
@ -0,0 +1,274 @@
|
|||
---
|
||||
sidebar_position: 4
|
||||
title: "Slack"
|
||||
description: "Set up Hermes Agent as a Slack bot using Socket Mode"
|
||||
---
|
||||
|
||||
# Slack Setup
|
||||
|
||||
Connect Hermes Agent to Slack as a bot using Socket Mode. Socket Mode uses WebSockets instead of
|
||||
public HTTP endpoints, so your Hermes instance doesn't need to be publicly accessible — it works
|
||||
behind firewalls, on your laptop, or on a private server.
|
||||
|
||||
:::warning Classic Slack Apps Deprecated
|
||||
Classic Slack apps (using RTM API) were **fully deprecated in March 2025**. Hermes uses the modern
|
||||
Bolt SDK with Socket Mode. If you have an old classic app, you must create a new one following
|
||||
the steps below.
|
||||
:::
|
||||
|
||||
## Overview
|
||||
|
||||
| Component | Value |
|
||||
|-----------|-------|
|
||||
| **Library** | `slack-bolt` / `slack_sdk` for Python (Socket Mode) |
|
||||
| **Connection** | WebSocket — no public URL required |
|
||||
| **Auth tokens needed** | Bot Token (`xoxb-`) + App-Level Token (`xapp-`) |
|
||||
| **User identification** | Slack Member IDs (e.g., `U01ABC2DEF3`) |
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Create a Slack App
|
||||
|
||||
1. Go to [https://api.slack.com/apps](https://api.slack.com/apps)
|
||||
2. Click **Create New App**
|
||||
3. Choose **From scratch**
|
||||
4. Enter an app name (e.g., "Hermes Agent") and select your workspace
|
||||
5. Click **Create App**
|
||||
|
||||
You'll land on the app's **Basic Information** page.
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Configure Bot Token Scopes
|
||||
|
||||
Navigate to **Features → OAuth & Permissions** in the sidebar. Scroll to **Scopes → Bot Token Scopes** and add the following:
|
||||
|
||||
| Scope | Purpose |
|
||||
|-------|---------|
|
||||
| `chat:write` | Send messages as the bot |
|
||||
| `app_mentions:read` | Detect when @mentioned in channels |
|
||||
| `channels:history` | Read messages in public channels the bot is in |
|
||||
| `channels:read` | List and get info about public channels |
|
||||
| `groups:history` | Read messages in private channels the bot is invited to |
|
||||
| `im:history` | Read direct message history |
|
||||
| `im:read` | View basic DM info |
|
||||
| `im:write` | Open and manage DMs |
|
||||
| `users:read` | Look up user information |
|
||||
| `files:write` | Upload files (images, audio, documents) |
|
||||
|
||||
:::caution Missing scopes = missing features
|
||||
Without `channels:history` and `groups:history`, the bot **will not receive messages in channels** —
|
||||
it will only work in DMs. These are the most commonly missed scopes.
|
||||
:::
|
||||
|
||||
**Optional scopes:**
|
||||
|
||||
| Scope | Purpose |
|
||||
|-------|---------|
|
||||
| `groups:read` | List and get info about private channels |
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Enable Socket Mode
|
||||
|
||||
Socket Mode lets the bot connect via WebSocket instead of requiring a public URL.
|
||||
|
||||
1. In the sidebar, go to **Settings → Socket Mode**
|
||||
2. Toggle **Enable Socket Mode** to ON
|
||||
3. You'll be prompted to create an **App-Level Token**:
|
||||
- Name it something like `hermes-socket` (the name doesn't matter)
|
||||
- Add the **`connections:write`** scope
|
||||
- Click **Generate**
|
||||
4. **Copy the token** — it starts with `xapp-`. This is your `SLACK_APP_TOKEN`
|
||||
|
||||
:::tip
|
||||
You can always find or regenerate app-level tokens under **Settings → Basic Information → App-Level Tokens**.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Subscribe to Events
|
||||
|
||||
This step is critical — it controls what messages the bot can see.
|
||||
|
||||
|
||||
1. In the sidebar, go to **Features → Event Subscriptions**
|
||||
2. Toggle **Enable Events** to ON
|
||||
3. Expand **Subscribe to bot events** and add:
|
||||
|
||||
| Event | Required? | Purpose |
|
||||
|-------|-----------|---------|
|
||||
| `message.im` | **Yes** | Bot receives direct messages |
|
||||
| `message.channels` | **Yes** | Bot receives messages in **public** channels it's added to |
|
||||
| `message.groups` | **Recommended** | Bot receives messages in **private** channels it's invited to |
|
||||
| `app_mention` | **Yes** | Prevents Bolt SDK errors when bot is @mentioned |
|
||||
|
||||
4. Click **Save Changes** at the bottom of the page
|
||||
|
||||
:::danger Missing event subscriptions is the #1 setup issue
|
||||
If the bot works in DMs but **not in channels**, you almost certainly forgot to add
|
||||
`message.channels` (for public channels) and/or `message.groups` (for private channels).
|
||||
Without these events, Slack simply never delivers channel messages to the bot.
|
||||
:::
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Install App to Workspace
|
||||
|
||||
1. In the sidebar, go to **Settings → Install App**
|
||||
2. Click **Install to Workspace**
|
||||
3. Review the permissions and click **Allow**
|
||||
4. After authorization, you'll see a **Bot User OAuth Token** starting with `xoxb-`
|
||||
5. **Copy this token** — this is your `SLACK_BOT_TOKEN`
|
||||
|
||||
:::tip
|
||||
If you change scopes or event subscriptions later, you **must reinstall the app** for the changes
|
||||
to take effect. The Install App page will show a banner prompting you to do so.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Find User IDs for the Allowlist
|
||||
|
||||
Hermes uses Slack **Member IDs** (not usernames or display names) for the allowlist.
|
||||
|
||||
To find a Member ID:
|
||||
|
||||
1. In Slack, click on the user's name or avatar
|
||||
2. Click **View full profile**
|
||||
3. Click the **⋮** (more) button
|
||||
4. Select **Copy member ID**
|
||||
|
||||
Member IDs look like `U01ABC2DEF3`. You need your own Member ID at minimum.
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Configure Hermes
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
SLACK_BOT_TOKEN=xoxb-your-bot-token-here
|
||||
SLACK_APP_TOKEN=xapp-your-app-token-here
|
||||
SLACK_ALLOWED_USERS=U01ABC2DEF3 # Comma-separated Member IDs
|
||||
|
||||
# Optional
|
||||
SLACK_HOME_CHANNEL=C01234567890 # Default channel for cron/scheduled messages
|
||||
SLACK_HOME_CHANNEL_NAME=general # Human-readable name for the home channel
|
||||
```
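
Swapping the two tokens is an easy mistake, so a quick sanity check before starting the gateway can help. The sketch below is a hypothetical helper, not a Hermes command. The `xoxb-` and `xapp-` prefixes come from this guide, while the Member ID pattern (a leading `U` or `W` followed by uppercase letters and digits) is a heuristic based on the example ID.

```python
import re
from typing import List, Mapping

# Heuristic only: based on the example ID U01ABC2DEF3.
_MEMBER_ID = re.compile(r"^[UW][A-Z0-9]+$")


def check_slack_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the values look plausible."""
    problems = []
    if not env.get("SLACK_BOT_TOKEN", "").startswith("xoxb-"):
        problems.append("SLACK_BOT_TOKEN should start with xoxb-")
    if not env.get("SLACK_APP_TOKEN", "").startswith("xapp-"):
        problems.append("SLACK_APP_TOKEN should start with xapp-")
    members = [m.strip() for m in env.get("SLACK_ALLOWED_USERS", "").split(",") if m.strip()]
    if not members:
        problems.append("SLACK_ALLOWED_USERS is empty")
    for member in members:
        if not _MEMBER_ID.match(member):
            problems.append(f"{member!r} does not look like a Member ID")
    return problems
```

Passing `os.environ` (or a parsed `.env` mapping) returns an empty list when all three values have the expected shape.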
|
||||
|
||||
Or run the interactive setup:
|
||||
|
||||
```bash
|
||||
hermes gateway setup # Select Slack when prompted
|
||||
```
|
||||
|
||||
Then start the gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway # Foreground
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux only: boot-time system service
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 8: Invite the Bot to Channels
|
||||
|
||||
After starting the gateway, you need to **invite the bot** to any channel where you want it to respond:
|
||||
|
||||
```
|
||||
/invite @Hermes Agent
|
||||
```
|
||||
|
||||
The bot will **not** automatically join channels. You must invite it to each channel individually.
|
||||
|
||||
---
|
||||
|
||||
## How the Bot Responds
|
||||
|
||||
Hermes behaves differently depending on where a message arrives:
|
||||
|
||||
| Context | Behavior |
|
||||
|---------|----------|
|
||||
| **DMs** | Bot responds to every message — no @mention needed |
|
||||
| **Channels** | Bot **only responds when @mentioned** (e.g., `@Hermes Agent what time is it?`). In channels, Hermes replies in a thread attached to that message. |
|
||||
| **Threads** | If you @mention Hermes inside an existing thread, it replies in that same thread. |
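
The table above can be expressed as a routing function over a Slack message event. This is a sketch under stated assumptions: it relies on the `channel_type`, `text`, `ts`, and `thread_ts` fields of Slack message events and on Slack's `<@MEMBER_ID>` mention encoding, while the function name `plan_reply` and the handling of threads inside DMs are illustrative choices, not Hermes's actual logic.

```python
from typing import Optional, Tuple


def plan_reply(event: dict, bot_user_id: str) -> Tuple[bool, Optional[str]]:
    """Return (should_respond, thread_ts) for a Slack message event."""
    if event.get("channel_type") == "im":
        # DMs: respond to every message, no @mention needed.
        return True, event.get("thread_ts")
    if f"<@{bot_user_id}>" not in event.get("text", ""):
        # Channels: messages without an @mention are ignored.
        return False, None
    # Reply in the existing thread, or start one on the mentioning message.
    return True, event.get("thread_ts") or event["ts"]
```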
|
||||
|
||||
:::tip
|
||||
In channels, always @mention the bot. A message that doesn't mention it is ignored.
|
||||
This is intentional — it prevents the bot from responding to every message in busy channels.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Home Channel
|
||||
|
||||
Set `SLACK_HOME_CHANNEL` to a channel ID where Hermes will deliver scheduled messages,
|
||||
cron job results, and other proactive notifications. To find a channel ID:
|
||||
|
||||
1. Right-click the channel name in Slack
|
||||
2. Click **View channel details**
|
||||
3. Scroll to the bottom — the Channel ID is shown there
|
||||
|
||||
```bash
|
||||
SLACK_HOME_CHANNEL=C01234567890
|
||||
```
|
||||
|
||||
Make sure the bot has been **invited to the channel** (`/invite @Hermes Agent`).
|
||||
|
||||
---
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Hermes supports voice on Slack:
|
||||
|
||||
- **Incoming:** Voice/audio messages are automatically transcribed using the configured STT provider: local `faster-whisper`, Groq Whisper (`GROQ_API_KEY`), or OpenAI Whisper (`VOICE_TOOLS_OPENAI_KEY`)
|
||||
- **Outgoing:** TTS responses are sent as audio file attachments
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| Bot doesn't respond to DMs | Verify `message.im` is in your event subscriptions and the app is reinstalled |
|
||||
| Bot works in DMs but not in channels | **Most common issue.** Add `message.channels` and `message.groups` to event subscriptions, reinstall the app, and invite the bot to the channel with `/invite @Hermes Agent` |
|
||||
| Bot doesn't respond to @mentions in channels | 1) Check `message.channels` event is subscribed. 2) Bot must be invited to the channel. 3) Ensure `channels:history` scope is added. 4) Reinstall the app after scope/event changes |
|
||||
| Bot ignores messages in private channels | Add both the `message.groups` event subscription and `groups:history` scope, then reinstall the app and `/invite` the bot |
|
||||
| "not_authed" or "invalid_auth" errors | Regenerate your Bot Token and App Token, update `.env` |
|
||||
| Bot responds but can't post in a channel | Invite the bot to the channel with `/invite @Hermes Agent` |
|
||||
| "missing_scope" error | Add the required scope in OAuth & Permissions, then **reinstall** the app |
|
||||
| Socket disconnects frequently | Check your network; Bolt auto-reconnects but unstable connections cause lag |
|
||||
| Changed scopes/events but nothing changed | You **must reinstall** the app to your workspace after any scope or event subscription change |
|
||||
|
||||
### Quick Checklist
|
||||
|
||||
If the bot isn't working in channels, verify **all** of the following:
|
||||
|
||||
1. ✅ `message.channels` event is subscribed (for public channels)
|
||||
2. ✅ `message.groups` event is subscribed (for private channels)
|
||||
3. ✅ `app_mention` event is subscribed
|
||||
4. ✅ `channels:history` scope is added (for public channels)
|
||||
5. ✅ `groups:history` scope is added (for private channels)
|
||||
6. ✅ App was **reinstalled** after adding scopes/events
|
||||
7. ✅ Bot was **invited** to the channel (`/invite @Hermes Agent`)
|
||||
8. ✅ You are **@mentioning** the bot in your message
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
**Always set `SLACK_ALLOWED_USERS`** with the Member IDs of authorized users. Without this setting,
|
||||
the gateway will **deny all messages** by default as a safety measure. Never share your bot tokens —
|
||||
treat them like passwords.
|
||||
:::
|
||||
|
||||
- Tokens should be stored in `~/.hermes/.env` (file permissions `600`)
|
||||
- Rotate tokens periodically via the Slack app settings
|
||||
- Audit who has access to your Hermes config directory
|
||||
- Socket Mode means no public endpoint is exposed — one less attack surface
|
||||
175
hermes_code/website/docs/user-guide/messaging/sms.md
Normal file
|
|
@ -0,0 +1,175 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "SMS (Twilio)"
|
||||
description: "Set up Hermes Agent as an SMS chatbot via Twilio"
|
||||
---
|
||||
|
||||
# SMS Setup (Twilio)
|
||||
|
||||
Hermes connects to SMS through the [Twilio](https://www.twilio.com/) API. People text your Twilio phone number and get AI responses back — same conversational experience as Telegram or Discord, but over standard text messages.
|
||||
|
||||
:::info Shared Credentials
|
||||
The SMS gateway shares credentials with the optional [telephony skill](/docs/reference/skills-catalog). If you've already set up Twilio for voice calls or one-off SMS, the gateway works with the same `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, and `TWILIO_PHONE_NUMBER`.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **Twilio account** — [Sign up at twilio.com](https://www.twilio.com/try-twilio) (free trial available)
|
||||
- **A Twilio phone number** with SMS capability
|
||||
- **A publicly accessible server** — Twilio sends webhooks to your server when SMS arrives
|
||||
- **aiohttp** — `pip install 'hermes-agent[sms]'`
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Get Your Twilio Credentials
|
||||
|
||||
1. Go to the [Twilio Console](https://console.twilio.com/)
|
||||
2. Copy your **Account SID** and **Auth Token** from the dashboard
|
||||
3. Go to **Phone Numbers → Manage → Active Numbers** — note your phone number in E.164 format (e.g., `+15551234567`)
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Configure Hermes
|
||||
|
||||
### Interactive setup (recommended)
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **SMS (Twilio)** from the platform list. The wizard will prompt for your credentials.
|
||||
|
||||
### Manual setup
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
|
||||
TWILIO_AUTH_TOKEN=your_auth_token_here
|
||||
TWILIO_PHONE_NUMBER=+15551234567
|
||||
|
||||
# Security: restrict to specific phone numbers (recommended)
|
||||
SMS_ALLOWED_USERS=+15559876543,+15551112222
|
||||
|
||||
# Optional: set a home channel for cron job delivery
|
||||
SMS_HOME_CHANNEL=+15559876543
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Configure Twilio Webhook
|
||||
|
||||
Twilio needs to know where to send incoming messages. In the [Twilio Console](https://console.twilio.com/):
|
||||
|
||||
1. Go to **Phone Numbers → Manage → Active Numbers**
|
||||
2. Click your phone number
|
||||
3. Under **Messaging → A MESSAGE COMES IN**, set:
|
||||
- **Webhook**: `https://your-server:8080/webhooks/twilio`
|
||||
- **HTTP Method**: `POST`
|
||||
|
||||
:::tip Exposing Your Webhook
|
||||
If you're running Hermes locally, use a tunnel to expose the webhook:
|
||||
|
||||
```bash
|
||||
# Using cloudflared
|
||||
cloudflared tunnel --url http://localhost:8080
|
||||
|
||||
# Using ngrok
|
||||
ngrok http 8080
|
||||
```
|
||||
|
||||
Set the resulting public URL as your Twilio webhook.
|
||||
:::
|
||||
|
||||
The webhook port defaults to `8080`. Override with:
|
||||
|
||||
```bash
|
||||
SMS_WEBHOOK_PORT=3000
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
You should see:
|
||||
|
||||
```
|
||||
[sms] Twilio webhook server listening on port 8080, from: +1555***4567
|
||||
```
|
||||
|
||||
Text your Twilio number — Hermes will respond via SMS.
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Required | Description |
|
||||
|----------|----------|-------------|
|
||||
| `TWILIO_ACCOUNT_SID` | Yes | Twilio Account SID (starts with `AC`) |
|
||||
| `TWILIO_AUTH_TOKEN` | Yes | Twilio Auth Token |
|
||||
| `TWILIO_PHONE_NUMBER` | Yes | Your Twilio phone number (E.164 format) |
|
||||
| `SMS_WEBHOOK_PORT` | No | Webhook listener port (default: `8080`) |
|
||||
| `SMS_ALLOWED_USERS` | No | Comma-separated E.164 phone numbers allowed to chat |
|
||||
| `SMS_ALLOW_ALL_USERS` | No | Set to `true` to allow anyone (not recommended) |
|
||||
| `SMS_HOME_CHANNEL` | No | Phone number for cron job / notification delivery |
|
||||
| `SMS_HOME_CHANNEL_NAME` | No | Display name for the home channel (default: `Home`) |
|
||||
|
||||
---
|
||||
|
||||
## SMS-Specific Behavior
|
||||
|
||||
- **Plain text only** — Markdown is automatically stripped since SMS renders it as literal characters
|
||||
- **1600 character limit** — Longer responses are split across multiple messages at natural boundaries (newlines, then spaces)
|
||||
- **Echo prevention** — Messages from your own Twilio number are ignored to prevent loops
|
||||
- **Phone number redaction** — Phone numbers are redacted in logs for privacy
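
The splitting rule can be sketched as follows: take up to 1600 characters, cut at the last newline in that window if there is one, otherwise at the last space, otherwise hard-split. This is one plausible reading of "natural boundaries", not the gateway's actual code, and the name `split_sms` is hypothetical.

```python
from typing import List

SMS_LIMIT = 1600


def split_sms(text: str, limit: int = SMS_LIMIT) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring newlines, then spaces."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look one character past the limit so a boundary exactly at `limit` counts.
        window = remaining[:limit + 1]
        cut = window.rfind("\n")
        if cut <= 0:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no boundary found: hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


print(split_sms("hello world foo", limit=10))  # ['hello', 'world foo']
```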
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
**The gateway denies all users by default.** Configure an allowlist:
|
||||
|
||||
```bash
|
||||
# Recommended: restrict to specific phone numbers
|
||||
SMS_ALLOWED_USERS=+15559876543,+15551112222
|
||||
|
||||
# Or allow all (NOT recommended for bots with terminal access)
|
||||
SMS_ALLOW_ALL_USERS=true
|
||||
```
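
Combined with the echo prevention described earlier, the deny-by-default policy amounts to a check like the one below. It is a hedged sketch: `should_process_sms` is a hypothetical name, and the order of the checks is an assumption.

```python
from typing import Iterable


def should_process_sms(sender: str, own_number: str,
                       allowed_users: Iterable[str], allow_all: bool = False) -> bool:
    """Deny by default; ignore our own number; otherwise honor the allowlist or allow-all flag."""
    if sender == own_number:
        return False  # echo prevention: never react to our own Twilio number
    if allow_all:
        return True  # SMS_ALLOW_ALL_USERS=true
    return sender in {n.strip() for n in allowed_users}
```

With an empty allowlist and `allow_all=False`, every sender is rejected, which matches the default described above.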
|
||||
|
||||
:::warning
|
||||
SMS has no built-in encryption. Don't use SMS for sensitive operations unless you understand the security implications. For sensitive use cases, prefer Signal or Telegram.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Messages not arriving
|
||||
|
||||
1. Check that your Twilio webhook URL is correct and publicly accessible
|
||||
2. Verify `TWILIO_ACCOUNT_SID` and `TWILIO_AUTH_TOKEN` are correct
|
||||
3. Check the Twilio Console → **Monitor → Logs → Messaging** for delivery errors
|
||||
4. Ensure your phone number is in `SMS_ALLOWED_USERS` (or `SMS_ALLOW_ALL_USERS=true`)
|
||||
|
||||
### Replies not sending
|
||||
|
||||
1. Check `TWILIO_PHONE_NUMBER` is set correctly (E.164 format with `+`)
|
||||
2. Verify your Twilio account has SMS-capable numbers
|
||||
3. Check Hermes gateway logs for Twilio API errors
|
||||
|
||||
### Webhook port conflicts
|
||||
|
||||
If port 8080 is already in use, change it:
|
||||
|
||||
```bash
|
||||
SMS_WEBHOOK_PORT=3001
|
||||
```
|
||||
|
||||
Update the webhook URL in Twilio Console to match.
|
||||
200
hermes_code/website/docs/user-guide/messaging/telegram.md
Normal file
|
|
@ -0,0 +1,200 @@
|
|||
---
|
||||
sidebar_position: 1
|
||||
title: "Telegram"
|
||||
description: "Set up Hermes Agent as a Telegram bot"
|
||||
---
|
||||
|
||||
# Telegram Setup
|
||||
|
||||
Hermes Agent integrates with Telegram as a full-featured conversational bot. Once connected, you can chat with your agent from any device, send voice memos that get auto-transcribed, receive scheduled task results, and use the agent in group chats. The integration is built on [python-telegram-bot](https://python-telegram-bot.org/) and supports text, voice, images, and file attachments.
|
||||
|
||||
## Step 1: Create a Bot via BotFather
|
||||
|
||||
Every Telegram bot requires an API token issued by [@BotFather](https://t.me/BotFather), Telegram's official bot management tool.
|
||||
|
||||
1. Open Telegram and search for **@BotFather**, or visit [t.me/BotFather](https://t.me/BotFather)
|
||||
2. Send `/newbot`
|
||||
3. Choose a **display name** (e.g., "Hermes Agent") — this can be anything
|
||||
4. Choose a **username** — this must be unique and end in `bot` (e.g., `my_hermes_bot`)
|
||||
5. BotFather replies with your **API token**. It looks like this:
|
||||
|
||||
```
|
||||
123456789:ABCdefGHIjklMNOpqrSTUvwxYZ
|
||||
```
|
||||
|
||||
:::warning
|
||||
Keep your bot token secret. Anyone with this token can control your bot. If it leaks, revoke it immediately via `/revoke` in BotFather.
|
||||
:::
|
||||
|
||||
## Step 2: Customize Your Bot (Optional)
|
||||
|
||||
These BotFather commands improve the user experience. Message @BotFather and use:
|
||||
|
||||
| Command | Purpose |
|
||||
|---------|---------|
|
||||
| `/setdescription` | The "What can this bot do?" text shown before a user starts chatting |
|
||||
| `/setabouttext` | Short text on the bot's profile page |
|
||||
| `/setuserpic` | Upload an avatar for your bot |
|
||||
| `/setcommands` | Define the command menu (the `/` button in chat) |
|
||||
| `/setprivacy` | Control whether the bot sees all group messages (see Step 3) |
|
||||
|
||||
:::tip
|
||||
For `/setcommands`, a useful starting set:
|
||||
|
||||
```
|
||||
help - Show help information
|
||||
new - Start a new conversation
|
||||
sethome - Set this chat as the home channel
|
||||
```
|
||||
:::
|
||||
|
||||
## Step 3: Privacy Mode (Critical for Groups)
|
||||
|
||||
Telegram bots have a **privacy mode** that is **enabled by default**. This is the single most common source of confusion when using bots in groups.
|
||||
|
||||
**With privacy mode ON**, your bot can only see:
|
||||
- Messages that start with a `/` command
|
||||
- Replies directly to the bot's own messages
|
||||
- Service messages (member joins/leaves, pinned messages, etc.)
|
||||
- Messages in channels where the bot is an admin
|
||||
|
||||
**With privacy mode OFF**, the bot receives every message in the group.
|
||||
|
||||
### How to disable privacy mode
|
||||
|
||||
1. Message **@BotFather**
|
||||
2. Send `/mybots`
|
||||
3. Select your bot
|
||||
4. Go to **Bot Settings → Group Privacy → Turn off**
|
||||
|
||||
:::warning
|
||||
**You must remove and re-add the bot to any group** after changing the privacy setting. Telegram caches the privacy state when a bot joins a group, and it will not update until the bot is removed and re-added.
|
||||
:::
|
||||
|
||||
:::tip
|
||||
An alternative to disabling privacy mode: promote the bot to **group admin**. Admin bots always receive all messages regardless of the privacy setting, and this avoids needing to toggle the global privacy mode.
|
||||
:::
|
||||
|
||||
## Step 4: Find Your User ID
|
||||
|
||||
Hermes Agent uses numeric Telegram user IDs to control access. Your user ID is **not** your username — it's a number like `123456789`.
|
||||
|
||||
**Method 1 (recommended):** Message [@userinfobot](https://t.me/userinfobot) — it instantly replies with your user ID.
|
||||
|
||||
**Method 2:** Message [@get_id_bot](https://t.me/get_id_bot) — another reliable option.
|
||||
|
||||
Save this number; you'll need it for the next step.
|
||||
|
||||
## Step 5: Configure Hermes
|
||||
|
||||
### Option A: Interactive Setup (Recommended)
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Select **Telegram** when prompted. The wizard asks for your bot token and allowed user IDs, then writes the configuration for you.
|
||||
|
||||
### Option B: Manual Configuration
|
||||
|
||||
Add the following to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
TELEGRAM_BOT_TOKEN=123456789:ABCdefGHIjklMNOpqrSTUvwxYZ
|
||||
TELEGRAM_ALLOWED_USERS=123456789 # Comma-separated for multiple users
|
||||
```
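
A common mistake is putting a username in `TELEGRAM_ALLOWED_USERS` instead of a numeric ID. The sketch below checks both values before the gateway starts. `parse_telegram_env` is a hypothetical helper, and the token pattern is deliberately loose: it only checks the `<digits>:<secret>` shape shown above, where the digits before the colon are the bot's own numeric ID.

```python
import re
from typing import List, Tuple

_TOKEN_SHAPE = re.compile(r"^(\d+):([A-Za-z0-9_-]+)$")


def parse_telegram_env(token: str, allowed_users: str) -> Tuple[int, List[int]]:
    """Return (bot_id, allowed_user_ids); raise ValueError on malformed input."""
    match = _TOKEN_SHAPE.match(token.strip())
    if match is None:
        raise ValueError("token should look like <digits>:<secret>")
    bot_id = int(match.group(1))
    # int() raises ValueError for usernames such as "@alice";
    # the allowlist takes numeric IDs only.
    user_ids = [int(part) for part in allowed_users.split(",") if part.strip()]
    return bot_id, user_ids
```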
|
||||
|
||||
### Start the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
The bot should come online within seconds. Send it a message on Telegram to verify.
|
||||
|
||||
## Home Channel
|
||||
|
||||
Use the `/sethome` command in any Telegram chat (DM or group) to designate it as the **home channel**. Scheduled tasks (cron jobs) deliver their results to this channel.
|
||||
|
||||
You can also set it manually in `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
TELEGRAM_HOME_CHANNEL=-1001234567890
|
||||
TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
||||
```
|
||||
|
||||
:::tip
|
||||
Group chat IDs are negative numbers (e.g., `-1001234567890`). Your personal DM chat ID is the same as your user ID.
|
||||
:::
|
||||
|
||||
## Voice Messages
|
||||
|
||||
### Incoming Voice (Speech-to-Text)
|
||||
|
||||
Voice messages you send on Telegram are automatically transcribed by Hermes's configured STT provider and injected as text into the conversation.
|
||||
|
||||
- `local` uses `faster-whisper` on the machine running Hermes — no API key required
|
||||
- `groq` uses Groq Whisper and requires `GROQ_API_KEY`
|
||||
- `openai` uses OpenAI Whisper and requires `VOICE_TOOLS_OPENAI_KEY`
|
||||
|
||||
### Outgoing Voice (Text-to-Speech)
|
||||
|
||||
When the agent generates audio via TTS, it's delivered as native Telegram **voice bubbles** — the round, inline-playable kind.
|
||||
|
||||
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup needed
|
||||
- **Edge TTS** (the default free provider) outputs MP3 and requires **ffmpeg** to convert to Opus:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install ffmpeg
|
||||
|
||||
# macOS
|
||||
brew install ffmpeg
|
||||
```
|
||||
|
||||
Without ffmpeg, Edge TTS audio is sent as a regular audio file (still playable, but uses the rectangular player instead of a voice bubble).
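
The fallback described above comes down to checking whether `ffmpeg` is on `PATH` and, if so, re-encoding the MP3 as Opus. The sketch below shows one way to express that. The function names are hypothetical, the exact ffmpeg flags Hermes uses are not documented here, and `libopus` must be available in your ffmpeg build.

```python
import shutil
from typing import List


def opus_convert_command(src_mp3: str, dst_ogg: str) -> List[str]:
    """Build an ffmpeg command that re-encodes MP3 audio as Opus in an Ogg container."""
    return ["ffmpeg", "-y", "-i", src_mp3, "-c:a", "libopus", dst_ogg]


def can_send_voice_bubble() -> bool:
    """True when ffmpeg is on PATH, so MP3 output can be converted to Opus."""
    return shutil.which("ffmpeg") is not None
```

The command list can be passed to `subprocess.run(..., check=True)` when `can_send_voice_bubble()` returns `True`.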
|
||||
|
||||
Configure the TTS provider in your `config.yaml` under the `tts.provider` key.
|
||||
|
||||
## Group Chat Usage
|
||||
|
||||
Hermes Agent works in Telegram group chats with a few considerations:
|
||||
|
||||
- **Privacy mode** determines what messages the bot can see (see [Step 3](#step-3-privacy-mode-critical-for-groups))
|
||||
- When privacy mode is on, **@mention the bot** (e.g., `@my_hermes_bot what's the weather?`) or **reply to its messages** to interact
|
||||
- When privacy mode is off (or the bot is a group admin), the bot sees all messages and can participate naturally
|
||||
- `TELEGRAM_ALLOWED_USERS` still applies — only authorized users can trigger the bot, even in groups
|
||||
|
||||
## Recent Bot API Features (2024–2025)
|
||||
|
||||
- **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing.
|
||||
- **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| Bot not responding at all | Verify `TELEGRAM_BOT_TOKEN` is correct. Check `hermes gateway` logs for errors. |
|
||||
| Bot responds with "unauthorized" | Your user ID is not in `TELEGRAM_ALLOWED_USERS`. Double-check with @userinfobot. |
|
||||
| Bot ignores group messages | Privacy mode is likely on. Disable it (Step 3) or make the bot a group admin. **Remember to remove and re-add the bot after changing privacy.** |
|
||||
| Voice messages not transcribed | Verify STT is available: install `faster-whisper` for local transcription, or set `GROQ_API_KEY` / `VOICE_TOOLS_OPENAI_KEY` in `~/.hermes/.env`. |
|
||||
| Voice replies are files, not bubbles | Install `ffmpeg` (needed for Edge TTS Opus conversion). |
|
||||
| Bot token revoked/invalid | Generate a new token via `/revoke` then `/newbot` or `/token` in BotFather. Update your `.env` file. |
|
||||
|
||||
## Exec Approval
|
||||
|
||||
When the agent tries to run a potentially dangerous command, it asks you for approval in the chat:
|
||||
|
||||
> ⚠️ This command is potentially dangerous (recursive delete). Reply "yes" to approve.
|
||||
|
||||
Reply "yes"/"y" to approve or "no"/"n" to deny.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `TELEGRAM_ALLOWED_USERS` to restrict who can interact with your bot. Without it, the gateway denies all users by default as a safety measure.
|
||||
:::
|
||||
|
||||
Never share your bot token publicly. If compromised, revoke it immediately via BotFather's `/revoke` command.
|
||||
|
||||
For more details, see the [Security documentation](/user-guide/security). You can also use [DM pairing](/user-guide/messaging#dm-pairing-alternative-to-allowlists) for a more dynamic approach to user authorization.
|
||||
310
hermes_code/website/docs/user-guide/messaging/webhooks.md
Normal file
|
|
@ -0,0 +1,310 @@
|
|||
---
|
||||
sidebar_position: 13
|
||||
title: "Webhooks"
|
||||
description: "Receive events from GitHub, GitLab, and other services to trigger Hermes agent runs"
|
||||
---
|
||||
|
||||
# Webhooks
|
||||
|
||||
Receive events from external services (GitHub, GitLab, JIRA, Stripe, etc.) and trigger Hermes agent runs automatically. The webhook adapter runs an HTTP server that accepts POST requests, validates HMAC signatures, transforms payloads into agent prompts, and routes responses back to the source or to another configured platform.
|
||||
|
||||
The agent processes the event and can respond by posting comments on PRs, sending messages to Telegram/Discord, or logging the result.
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
1. Enable via `hermes gateway setup` or environment variables
|
||||
2. Define webhook routes in `config.yaml`
|
||||
3. Point your service at `http://your-server:8644/webhooks/<route-name>`
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
There are two ways to enable the webhook adapter.
|
||||
|
||||
### Via setup wizard
|
||||
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
|
||||
Follow the prompts to enable webhooks, set the port, and set a global HMAC secret.
|
||||
|
||||
### Via environment variables
|
||||
|
||||
Add to `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
WEBHOOK_ENABLED=true
|
||||
WEBHOOK_PORT=8644 # default
|
||||
WEBHOOK_SECRET=your-global-secret
|
||||
```
|
||||
|
||||
### Verify the server
|
||||
|
||||
Once the gateway is running:
|
||||
|
||||
```bash
|
||||
curl http://localhost:8644/health
|
||||
```
|
||||
|
||||
Expected response:
|
||||
|
||||
```json
|
||||
{"status": "ok", "platform": "webhook"}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuring Routes {#configuring-routes}
|
||||
|
||||
Routes define how different webhook sources are handled. Each route is a named entry under `platforms.webhook.extra.routes` in your `config.yaml`.
|
||||
|
||||
### Route properties
|
||||
|
||||
| Property | Required | Description |
|
||||
|----------|----------|-------------|
|
||||
| `events` | No | List of event types to accept (e.g. `["pull_request"]`). If empty, all events are accepted. Event type is read from `X-GitHub-Event`, `X-GitLab-Event`, or `event_type` in the payload. |
|
||||
| `secret` | **Yes** (on the route or via the global `secret`) | Secret used for signature validation. If not set on the route, the global `secret` is used. Set to `"INSECURE_NO_AUTH"` for testing only (skips validation). |
|
||||
| `prompt` | No | Template string with dot-notation payload access (e.g. `{pull_request.title}`). If omitted, the full JSON payload is dumped into the prompt. |
|
||||
| `skills` | No | List of skill names to load for the agent run. |
|
||||
| `deliver` | No | Where to send the response: `github_comment`, `telegram`, `discord`, `slack`, `signal`, `sms`, or `log` (default). |
|
||||
| `deliver_extra` | No | Additional delivery config — keys depend on `deliver` type (e.g. `repo`, `pr_number`, `chat_id`). Values support the same `{dot.notation}` templates as `prompt`. |
|
||||
|
||||
### Full example
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
webhook:
|
||||
enabled: true
|
||||
extra:
|
||||
port: 8644
|
||||
secret: "global-fallback-secret"
|
||||
routes:
|
||||
github-pr:
|
||||
events: ["pull_request"]
|
||||
secret: "github-webhook-secret"
|
||||
prompt: |
|
||||
Review this pull request:
|
||||
Repository: {repository.full_name}
|
||||
PR #{number}: {pull_request.title}
|
||||
Author: {pull_request.user.login}
|
||||
URL: {pull_request.html_url}
|
||||
Diff URL: {pull_request.diff_url}
|
||||
Action: {action}
|
||||
skills: ["github-code-review"]
|
||||
deliver: "github_comment"
|
||||
deliver_extra:
|
||||
repo: "{repository.full_name}"
|
||||
pr_number: "{number}"
|
||||
deploy-notify:
|
||||
events: ["push"]
|
||||
secret: "deploy-secret"
|
||||
prompt: "New push to {repository.full_name} branch {ref}: {head_commit.message}"
|
||||
deliver: "telegram"
|
||||
```
|
||||
|
||||
### Prompt Templates
|
||||
|
||||
Prompts use dot-notation to access nested fields in the webhook payload:
|
||||
|
||||
- `{pull_request.title}` resolves to `payload["pull_request"]["title"]`
|
||||
- `{repository.full_name}` resolves to `payload["repository"]["full_name"]`
|
||||
- Missing keys are left as the literal `{key}` string (no error)
|
||||
- Nested dicts and lists are JSON-serialized and truncated at 2000 characters
|
||||
|
||||
If no `prompt` template is configured for a route, the entire payload is dumped as indented JSON (truncated at 4000 characters).
|
||||
|
||||
The same dot-notation templates work in `deliver_extra` values.
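
The substitution rules above can be sketched in a few lines of Python. This is an illustrative approximation, not the adapter's actual code: the function names `resolve` and `render_prompt` and the placeholder regex are assumptions made for the example.

```python
import json
import re


def resolve(payload, dotted):
    """Walk the payload along a dot-separated path; return (found, value)."""
    current = payload
    for part in dotted.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return False, None
    return True, current


def render_prompt(template, payload, max_nested=2000):
    """Replace {dot.path} placeholders with values from the payload."""

    def substitute(match):
        found, value = resolve(payload, match.group(1))
        if not found:
            return match.group(0)  # missing key: keep the literal {key}
        if isinstance(value, (dict, list)):
            # nested structures are JSON-serialized and truncated
            return json.dumps(value)[:max_nested]
        return str(value)

    # Hypothetical placeholder syntax: letters, digits, underscores, dots
    return re.sub(r"\{([A-Za-z0-9_.]+)\}", substitute, template)
```

For example, `render_prompt("PR #{number}: {pull_request.title}", payload)` fills in both fields when they exist and leaves any unknown placeholder untouched.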
|
||||
|
||||
---
|
||||
|
||||
## GitHub PR Review (Step by Step) {#github-pr-review}
|
||||
|
||||
This walkthrough sets up automatic code review on every pull request.
|
||||
|
||||
### 1. Create the webhook in GitHub
|
||||
|
||||
1. Go to your repository → **Settings** → **Webhooks** → **Add webhook**
|
||||
2. Set **Payload URL** to `http://your-server:8644/webhooks/github-pr`
|
||||
3. Set **Content type** to `application/json`
|
||||
4. Set **Secret** to match your route config (e.g. `github-webhook-secret`)
|
||||
5. Under **Which events?**, select **Let me select individual events** and check **Pull requests**
|
||||
6. Click **Add webhook**
|
||||
|
||||
### 2. Add the route config
|
||||
|
||||
Add the `github-pr` route to your `~/.hermes/config.yaml` as shown in the example above.
|
||||
|
||||
### 3. Ensure `gh` CLI is authenticated
|
||||
|
||||
The `github_comment` delivery type uses the GitHub CLI to post comments:
|
||||
|
||||
```bash
|
||||
gh auth login
|
||||
```
|
||||
|
||||
### 4. Test it
|
||||
|
||||
Open a pull request on the repository. The webhook fires, Hermes processes the event, and posts a review comment on the PR.
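
To exercise the route without opening a real pull request, you can build a signed request by hand. The sketch below is only an assumption-laden test helper: the function name `build_github_style_request` is hypothetical, the payload you pass is a tiny stand-in for what GitHub really sends, and the headers are the ones this page mentions (`X-GitHub-Event`, `X-GitHub-Delivery`, `X-Hub-Signature-256`).

```python
import hashlib
import hmac
import json
import urllib.request
import uuid


def build_github_style_request(url, payload, secret, event="pull_request"):
    """Build a POST request signed the way GitHub signs webhook bodies."""
    body = json.dumps(payload).encode()
    # HMAC-SHA256 over the exact bytes that will be sent
    signature = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Event": event,
        "X-GitHub-Delivery": str(uuid.uuid4()),  # fresh ID so it is not deduplicated
        "X-Hub-Signature-256": "sha256=" + signature,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen(...)` sends it, for example against `http://localhost:8644/webhooks/github-pr` with the secret from your route config.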
|
||||
|
||||
---
|
||||
|
||||
## GitLab Webhook Setup {#gitlab-webhook-setup}
|
||||
|
||||
GitLab webhooks work similarly but use a different authentication mechanism. GitLab sends the secret as a plain `X-Gitlab-Token` header (exact string match, not HMAC).
|
||||
|
||||
### 1. Create the webhook in GitLab
|
||||
|
||||
1. Go to your project → **Settings** → **Webhooks**
|
||||
2. Set the **URL** to `http://your-server:8644/webhooks/gitlab-mr`
|
||||
3. Enter your **Secret token**
|
||||
4. Select **Merge request events** (and any other events you want)
|
||||
5. Click **Add webhook**
|
||||
|
||||
### 2. Add the route config
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
webhook:
|
||||
enabled: true
|
||||
extra:
|
||||
routes:
|
||||
gitlab-mr:
|
||||
events: ["merge_request"]
|
||||
secret: "your-gitlab-secret-token"
|
||||
prompt: |
|
||||
Review this merge request:
|
||||
Project: {project.path_with_namespace}
|
||||
MR !{object_attributes.iid}: {object_attributes.title}
|
||||
Author: {object_attributes.last_commit.author.name}
|
||||
URL: {object_attributes.url}
|
||||
Action: {object_attributes.action}
|
||||
deliver: "log"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Delivery Options {#delivery-options}
|
||||
|
||||
The `deliver` field controls where the agent's response goes after processing the webhook event.
|
||||
|
||||
| Deliver Type | Description |
|
||||
|-------------|-------------|
|
||||
| `log` | Logs the response to the gateway log output. This is the default and is useful for testing. |
|
||||
| `github_comment` | Posts the response as a PR/issue comment via the `gh` CLI. Requires `deliver_extra.repo` and `deliver_extra.pr_number`. The `gh` CLI must be installed and authenticated on the gateway host (`gh auth login`). |
|
||||
| `telegram` | Routes the response to Telegram. Uses the home channel, or specify `chat_id` in `deliver_extra`. |
|
||||
| `discord` | Routes the response to Discord. Uses the home channel, or specify `chat_id` in `deliver_extra`. |
|
||||
| `slack` | Routes the response to Slack. Uses the home channel, or specify `chat_id` in `deliver_extra`. |
|
||||
| `signal` | Routes the response to Signal. Uses the home channel, or specify `chat_id` in `deliver_extra`. |
|
||||
| `sms` | Routes the response to SMS via Twilio. Uses the home channel, or specify `chat_id` in `deliver_extra`. |
|
||||
|
||||
For cross-platform delivery (telegram, discord, slack, signal, sms), the target platform must also be enabled and connected in the gateway. If no `chat_id` is provided in `deliver_extra`, the response is sent to that platform's configured home channel.
|
||||
|
||||
---
|
||||
|
||||
## Security {#security}
|
||||
|
||||
The webhook adapter includes multiple layers of security:
|
||||
|
||||
### HMAC signature validation
|
||||
|
||||
The adapter validates incoming webhook signatures using the appropriate method for each source:
|
||||
|
||||
- **GitHub**: `X-Hub-Signature-256` header — HMAC-SHA256 hex digest prefixed with `sha256=`
|
||||
- **GitLab**: `X-Gitlab-Token` header — plain secret string match
|
||||
- **Generic**: `X-Webhook-Signature` header — raw HMAC-SHA256 hex digest
|
||||
|
||||
If a secret is configured but no recognized signature header is present, the request is rejected.
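
The three checks above can be sketched as follows. This is a minimal illustration under stated assumptions, not the adapter's implementation: the function name `verify_signature` and the order in which headers are tried are choices made for the example.

```python
import hashlib
import hmac


def verify_signature(headers, body, secret):
    """Return True if the request carries a valid signature for `secret`.

    `headers` is a mapping of header names to values, `body` is the raw
    request bytes, and `secret` is the configured route or global secret.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

    def same(a, b):
        # constant-time comparison to avoid leaking how many characters match
        return hmac.compare_digest(a.encode(), b.encode())

    if "x-hub-signature-256" in lowered:  # GitHub: "sha256=" + hex digest
        return same(lowered["x-hub-signature-256"], "sha256=" + expected)
    if "x-gitlab-token" in lowered:  # GitLab: plain secret match
        return same(lowered["x-gitlab-token"], secret)
    if "x-webhook-signature" in lowered:  # Generic: raw hex digest
        return same(lowered["x-webhook-signature"], expected)
    return False  # no recognized signature header: reject
```

The HMAC must be computed over the raw body bytes exactly as received; re-serializing the parsed JSON can change whitespace and produce a different digest.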
|
||||
|
||||
### Secret is required
|
||||
|
||||
Every route must have a secret — either set directly on the route or inherited from the global `secret`. Routes without a secret cause the adapter to fail at startup with an error. For development/testing only, you can set the secret to `"INSECURE_NO_AUTH"` to skip validation entirely.
|
||||
|
||||
### Rate limiting
|
||||
|
||||
Each route is rate-limited to **30 requests per minute** by default (fixed-window). Configure this globally:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
webhook:
|
||||
extra:
|
||||
rate_limit: 60 # requests per minute
|
||||
```
|
||||
|
||||
Requests exceeding the limit receive a `429 Too Many Requests` response.
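
A fixed-window limiter counts requests per route within each 60-second window and resets the count when a new window begins. The sketch below illustrates the idea; the class name `FixedWindowLimiter` and its internals are assumptions for the example, not the adapter's code.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per route in each fixed time window."""

    def __init__(self, limit=30, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # route name -> (window index, request count)

    def allow(self, route):
        window = int(self.clock() // self.window_seconds)
        start, count = self._windows.get(route, (window, 0))
        if start != window:
            start, count = window, 0  # a new window began: reset the counter
        if count >= self.limit:
            self._windows[route] = (start, count)
            return False  # the caller would answer 429 Too Many Requests
        self._windows[route] = (start, count + 1)
        return True
```

A fixed window is simple but permits short bursts across a window boundary, which is usually acceptable for webhook traffic.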
|
||||
|
||||
### Idempotency
|
||||
|
||||
Delivery IDs (from `X-GitHub-Delivery`, `X-Request-ID`, or a timestamp fallback) are cached for **1 hour**. Duplicate deliveries (e.g. webhook retries) are silently skipped with a `200` response, preventing duplicate agent runs.
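
The deduplication described above amounts to a small cache with a time-to-live. The following is a sketch under assumptions (the class name `DeliveryCache` and the in-memory dictionary are illustrative, not the adapter's actual storage):

```python
import time


class DeliveryCache:
    """Remember delivery IDs for `ttl_seconds` to skip webhook retries."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._seen = {}  # delivery ID -> time first seen

    def seen_before(self, delivery_id):
        now = self.clock()
        # drop entries older than the TTL so the cache does not grow forever
        self._seen = {
            key: stamp
            for key, stamp in self._seen.items()
            if now - stamp < self.ttl_seconds
        }
        if delivery_id in self._seen:
            return True  # duplicate: answer 200 and skip the agent run
        self._seen[delivery_id] = now
        return False
```

Note that when a source sends no delivery ID header and a timestamp is used instead, retries get different IDs and cannot be deduplicated this way.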
|
||||
|
||||
### Body size limits
|
||||
|
||||
Payloads exceeding **1 MB** are rejected before the body is read. Configure this:
|
||||
|
||||
```yaml
|
||||
platforms:
|
||||
webhook:
|
||||
extra:
|
||||
max_body_bytes: 2097152 # 2 MB
|
||||
```
|
||||
|
||||
### Prompt injection risk
|
||||
|
||||
:::warning
|
||||
Webhook payloads contain attacker-controlled data — PR titles, commit messages, issue descriptions, etc. can all contain malicious instructions. Run the gateway in a sandboxed environment (Docker, VM) when exposed to the internet. Consider using the Docker or SSH terminal backend for isolation.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting {#troubleshooting}
|
||||
|
||||
### Webhook not arriving
|
||||
|
||||
- Verify the port is exposed and accessible from the webhook source
|
||||
- Check firewall rules — port `8644` (or your configured port) must be open
|
||||
- Verify the URL path matches: `http://your-server:8644/webhooks/<route-name>`
|
||||
- Use the `/health` endpoint to confirm the server is running
|
||||
|
||||
### Signature validation failing
|
||||
|
||||
- Ensure the secret in your route config exactly matches the secret configured in the webhook source
|
||||
- For GitHub, the signature is an HMAC-SHA256 of the request body computed with the secret; check the `X-Hub-Signature-256` header
|
||||
- For GitLab, the secret is compared as a plain token; check the `X-Gitlab-Token` header
|
||||
- Check gateway logs for `Invalid signature` warnings
|
||||
|
||||
### Event being ignored
|
||||
|
||||
- Check that the event type is in your route's `events` list
|
||||
- GitHub events use values like `pull_request`, `push`, `issues` (the `X-GitHub-Event` header value)
|
||||
- GitLab events use values like `merge_request`, `push` (the `X-GitLab-Event` header value)
|
||||
- If `events` is empty or not set, all events are accepted
|
||||
|
||||
### Agent not responding
|
||||
|
||||
- Run the gateway in foreground to see logs: `hermes gateway run`
|
||||
- Check that the prompt template is rendering correctly
|
||||
- Verify the delivery target is configured and connected
|
||||
|
||||
### Duplicate responses
|
||||
|
||||
- The idempotency cache should prevent this — check that the webhook source is sending a delivery ID header (`X-GitHub-Delivery` or `X-Request-ID`)
|
||||
- Delivery IDs are cached for 1 hour
|
||||
|
||||
### `gh` CLI errors (GitHub comment delivery)
|
||||
|
||||
- Run `gh auth login` on the gateway host
|
||||
- Ensure the authenticated GitHub user has write access to the repository
|
||||
- Check that `gh` is installed and on the PATH
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables {#environment-variables}
|
||||
|
||||
| Variable | Description | Default |
|
||||
|----------|-------------|---------|
|
||||
| `WEBHOOK_ENABLED` | Enable the webhook platform adapter | `false` |
|
||||
| `WEBHOOK_PORT` | HTTP server port for receiving webhooks | `8644` |
|
||||
| `WEBHOOK_SECRET` | Global HMAC secret (used as fallback when routes don't specify their own) | _(none)_ |
|
||||
200
hermes_code/website/docs/user-guide/messaging/whatsapp.md
Normal file
|
|
@ -0,0 +1,200 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "WhatsApp"
|
||||
description: "Set up Hermes Agent as a WhatsApp bot via the built-in Baileys bridge"
|
||||
---
|
||||
|
||||
# WhatsApp Setup
|
||||
|
||||
Hermes connects to WhatsApp through a built-in bridge based on **Baileys**. This works by emulating a WhatsApp Web session — **not** through the official WhatsApp Business API. No Meta developer account or Business verification is required.
|
||||
|
||||
:::warning Unofficial API — Ban Risk
|
||||
WhatsApp does **not** officially support third-party bots outside the Business API. Using a third-party bridge carries a small risk of account restrictions. To minimize risk:
|
||||
- **Use a dedicated phone number** for the bot (not your personal number)
|
||||
- **Don't send bulk/spam messages** — keep usage conversational
|
||||
- **Don't automate outbound messaging** to people who haven't messaged first
|
||||
:::
|
||||
|
||||
:::warning WhatsApp Web Protocol Updates
|
||||
WhatsApp periodically updates its Web protocol, which can temporarily break compatibility
|
||||
with third-party bridges. When this happens, Hermes will update the bridge dependency. If the
|
||||
bot stops working after a WhatsApp update, pull the latest Hermes version and re-pair.
|
||||
:::
|
||||
|
||||
## Two Modes
|
||||
|
||||
| Mode | How it works | Best for |
|
||||
|------|-------------|----------|
|
||||
| **Separate bot number** (recommended) | Dedicate a phone number to the bot. People message that number directly. | Clean UX, multiple users, lower ban risk |
|
||||
| **Personal self-chat** | Use your own WhatsApp. You message yourself to talk to the agent. | Quick setup, single user, testing |
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **Node.js v18+** and **npm** — the WhatsApp bridge runs as a Node.js process
|
||||
- **A phone with WhatsApp** installed (for scanning the QR code)
|
||||
|
||||
Unlike older browser-driven bridges, the current Baileys-based bridge does **not** require a local Chromium or Puppeteer dependency stack.
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Run the Setup Wizard
|
||||
|
||||
```bash
|
||||
hermes whatsapp
|
||||
```
|
||||
|
||||
The wizard will:
|
||||
|
||||
1. Ask which mode you want (**bot** or **self-chat**)
|
||||
2. Install bridge dependencies if needed
|
||||
3. Display a **QR code** in your terminal
|
||||
4. Wait for you to scan it
|
||||
|
||||
**To scan the QR code:**
|
||||
|
||||
1. Open WhatsApp on your phone
|
||||
2. Go to **Settings → Linked Devices**
|
||||
3. Tap **Link a Device**
|
||||
4. Point your camera at the terminal QR code
|
||||
|
||||
Once paired, the wizard confirms the connection and exits. Your session is saved automatically.
|
||||
|
||||
:::tip
|
||||
If the QR code looks garbled, make sure your terminal is at least 60 columns wide and supports
|
||||
Unicode. You can also try a different terminal emulator.
|
||||
:::
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Getting a Second Phone Number (Bot Mode)
|
||||
|
||||
For bot mode, you need a phone number that isn't already registered with WhatsApp. Three options:
|
||||
|
||||
| Option | Cost | Notes |
|
||||
|--------|------|-------|
|
||||
| **Google Voice** | Free | US only. Get a number at [voice.google.com](https://voice.google.com). Verify WhatsApp via SMS through the Google Voice app. |
|
||||
| **Prepaid SIM** | $5–15 one-time | Any carrier. Activate, verify WhatsApp, then the SIM can sit in a drawer. Number must stay active (make a call every 90 days). |
|
||||
| **VoIP services** | Free–$5/month | TextNow, TextFree, or similar. Some VoIP numbers are blocked by WhatsApp — try a few if the first doesn't work. |
|
||||
|
||||
After getting the number:
|
||||
|
||||
1. Install WhatsApp on a phone (or use the WhatsApp Business app on a dual-SIM phone)
|
||||
2. Register the new number with WhatsApp
|
||||
3. Run `hermes whatsapp` and scan the QR code from that WhatsApp account
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Configure Hermes
|
||||
|
||||
Add the following to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
WHATSAPP_ENABLED=true
|
||||
WHATSAPP_MODE=bot # "bot" or "self-chat"
|
||||
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers (with country code, no +)
|
||||
```
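
Numbers copied from a contact list often contain a `+`, spaces, or punctuation. A small helper like the one below can produce the expected format; it is a convenience sketch with hypothetical function names and is not part of Hermes.

```python
import re


def normalize_phone(number):
    """Keep digits only, e.g. '+1 (555) 123-4567' becomes '15551234567'."""
    return re.sub(r"\D", "", number)


def format_allowed_users(numbers):
    """Join normalized numbers into a comma-separated value for the .env file."""
    return ",".join(normalize_phone(number) for number in numbers)
```

The helper does not add a missing country code, so each input number must already include one.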
|
||||
|
||||
Optional behavior settings in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
unauthorized_dm_behavior: pair
|
||||
|
||||
whatsapp:
|
||||
unauthorized_dm_behavior: ignore
|
||||
```
|
||||
|
||||
- `unauthorized_dm_behavior: pair` is the global default. Unknown DM senders get a pairing code.
|
||||
- `whatsapp.unauthorized_dm_behavior: ignore` makes WhatsApp stay silent for unauthorized DMs, which is usually the better choice for a private number.
|
||||
|
||||
Then start the gateway:
|
||||
|
||||
```bash
|
||||
hermes gateway # Foreground
|
||||
hermes gateway install # Install as a user service
|
||||
sudo hermes gateway install --system # Linux only: boot-time system service
|
||||
```
|
||||
|
||||
The gateway starts the WhatsApp bridge automatically using the saved session.
|
||||
|
||||
---
|
||||
|
||||
## Session Persistence
|
||||
|
||||
The Baileys bridge saves its session under `~/.hermes/whatsapp/session`. This means:
|
||||
|
||||
- **Sessions survive restarts** — you don't need to re-scan the QR code every time
|
||||
- The session data includes encryption keys and device credentials
|
||||
- **Do not share or commit this session directory** — it grants full access to the WhatsApp account
|
||||
|
||||
---
|
||||
|
||||
## Re-pairing
|
||||
|
||||
If the session breaks (phone reset, WhatsApp update, manually unlinked), you'll see connection
|
||||
errors in the gateway logs. To fix it:
|
||||
|
||||
```bash
|
||||
hermes whatsapp
|
||||
```
|
||||
|
||||
This generates a fresh QR code. Scan it again and the session is re-established. The gateway
|
||||
handles **temporary** disconnections (network blips, phone going offline briefly) automatically
|
||||
with reconnection logic.
|
||||
|
||||
---
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Hermes supports voice on WhatsApp:
|
||||
|
||||
- **Incoming:** Voice messages (`.ogg` opus) are automatically transcribed using the configured STT provider: local `faster-whisper`, Groq Whisper (`GROQ_API_KEY`), or OpenAI Whisper (`VOICE_TOOLS_OPENAI_KEY`)
|
||||
- **Outgoing:** TTS responses are sent as MP3 audio file attachments
|
||||
- Agent responses are prefixed with "⚕ **Hermes Agent**" by default. You can customize or disable this in `config.yaml`:
|
||||
|
||||
```yaml
|
||||
# ~/.hermes/config.yaml
|
||||
whatsapp:
|
||||
reply_prefix: "" # Empty string disables the header
|
||||
# reply_prefix: "🤖 *My Bot*\n──────\n" # Custom prefix (supports \n for newlines)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Problem | Solution |
|
||||
|---------|----------|
|
||||
| **QR code not scanning** | Ensure terminal is wide enough (60+ columns). Try a different terminal. Make sure you're scanning from the correct WhatsApp account (bot number, not personal). |
|
||||
| **QR code expires** | QR codes refresh every ~20 seconds. If it times out, restart `hermes whatsapp`. |
|
||||
| **Session not persisting** | Check that `~/.hermes/whatsapp/session` exists and is writable. If containerized, mount it as a persistent volume. |
|
||||
| **Logged out unexpectedly** | WhatsApp unlinks devices after long inactivity. Keep the phone on and connected to the network, then re-pair with `hermes whatsapp` if needed. |
|
||||
| **Bridge crashes or reconnect loops** | Restart the gateway, update Hermes, and re-pair if the session was invalidated by a WhatsApp protocol change. |
|
||||
| **Bot stops working after WhatsApp update** | Update Hermes to get the latest bridge version, then re-pair. |
|
||||
| **Messages not being received** | Verify `WHATSAPP_ALLOWED_USERS` includes the sender's number (with country code, no `+` or spaces). |
|
||||
| **Bot replies to strangers with a pairing code** | Set `whatsapp.unauthorized_dm_behavior: ignore` in `~/.hermes/config.yaml` if you want unauthorized DMs to be silently ignored instead. |
|
||||
|
||||
---
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
**Always set `WHATSAPP_ALLOWED_USERS`** with phone numbers (including country code, without the `+`)
|
||||
of authorized users. Without this setting, the gateway will **deny all incoming messages** as a
|
||||
safety measure.
|
||||
:::
|
||||
|
||||
By default, unauthorized DMs still receive a pairing code reply. If you want a private WhatsApp number to stay completely silent to strangers, set:
|
||||
|
||||
```yaml
|
||||
whatsapp:
|
||||
unauthorized_dm_behavior: ignore
|
||||
```
|
||||
|
||||
- The `~/.hermes/whatsapp/session` directory contains full session credentials — protect it like a password
|
||||
- Set file permissions: `chmod 700 ~/.hermes/whatsapp/session`
|
||||
- Use a **dedicated phone number** for the bot to isolate risk from your personal account
|
||||
- If you suspect compromise, unlink the device from WhatsApp → Settings → Linked Devices
|
||||
- Phone numbers in logs are partially redacted, but review your log retention policy
|
||||
450
hermes_code/website/docs/user-guide/security.md
Normal file
|
|
@ -0,0 +1,450 @@
|
|||
---
|
||||
sidebar_position: 8
|
||||
title: "Security"
|
||||
description: "Security model, dangerous command approval, user authorization, container isolation, and production deployment best practices"
|
||||
---
|
||||
|
||||
# Security
|
||||
|
||||
Hermes Agent is designed with a defense-in-depth security model. This page covers every security boundary — from command approval to container isolation to user authorization on messaging platforms.
|
||||
|
||||
## Overview
|
||||
|
||||
The security model has five layers:
|
||||
|
||||
1. **User authorization** — who can talk to the agent (allowlists, DM pairing)
|
||||
2. **Dangerous command approval** — human-in-the-loop for destructive operations
|
||||
3. **Container isolation** — Docker/Singularity/Modal sandboxing with hardened settings
|
||||
4. **MCP credential filtering** — environment variable isolation for MCP subprocesses
|
||||
5. **Context file scanning** — prompt injection detection in project files
|
||||
|
||||
## Dangerous Command Approval
|
||||
|
||||
Before executing any command, Hermes checks it against a curated list of dangerous patterns. If a match is found, the user must explicitly approve it.
|
||||
|
||||
### What Triggers Approval
|
||||
|
||||
The following patterns trigger approval prompts (defined in `tools/approval.py`):
|
||||
|
||||
| Pattern | Description |
|
||||
|---------|-------------|
|
||||
| `rm -r` / `rm --recursive` | Recursive delete |
|
||||
| `rm ... /` | Delete in root path |
|
||||
| `chmod 777` | World-writable permissions |
|
||||
| `mkfs` | Format filesystem |
|
||||
| `dd if=` | Disk copy |
|
||||
| `DROP TABLE/DATABASE` | SQL DROP |
|
||||
| `DELETE FROM` (without WHERE) | SQL DELETE without WHERE |
|
||||
| `TRUNCATE TABLE` | SQL TRUNCATE |
|
||||
| `> /etc/` | Overwrite system config |
|
||||
| `systemctl stop/disable/mask` | Stop/disable system services |
|
||||
| `kill -9 -1` | Kill all processes |
|
||||
| `curl ... \| sh` | Pipe remote content to shell |
|
||||
| `bash -c`, `python -e` | Shell/script execution via flags |
|
||||
| `find -exec rm`, `find -delete` | Find with destructive actions |
|
||||
| Fork bomb patterns | Fork bombs |
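
Pattern-based detection of this kind can be sketched with regular expressions. The patterns below are simplified illustrations covering a few rows of the table; they are assumptions for the example and not the actual list in `tools/approval.py`.

```python
import re

# (regex, description) pairs; illustrative only
DANGEROUS_PATTERNS = [
    (r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*|--recursive)\b", "recursive delete"),
    (r"\bchmod\s+777\b", "world-writable permissions"),
    (r"\bmkfs\b", "format filesystem"),
    (r"\bDROP\s+(TABLE|DATABASE)\b", "SQL DROP"),
    (r"\bcurl\b.*\|\s*(ba)?sh\b", "pipe remote content to shell"),
]


def detect_dangerous(command):
    """Return a description of the first matching pattern, or None."""
    for pattern, description in DANGEROUS_PATTERNS:
        if re.search(pattern, command, re.IGNORECASE):
            return description
    return None
```

A command such as `rm -rf /tmp/old-project` matches the first pattern, while `rm notes.txt` matches none and would run without a prompt.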
|
||||
|
||||
:::info
|
||||
**Container bypass**: When running in `docker`, `singularity`, `modal`, or `daytona` backends, dangerous command checks are **skipped** because the container itself is the security boundary. Destructive commands inside a container can't harm the host.
|
||||
:::
|
||||
|
||||
### Approval Flow (CLI)
|
||||
|
||||
In the interactive CLI, dangerous commands show an inline approval prompt:
|
||||
|
||||
```
|
||||
⚠️ DANGEROUS COMMAND: recursive delete
|
||||
rm -rf /tmp/old-project
|
||||
|
||||
[o]nce | [s]ession | [a]lways | [d]eny
|
||||
|
||||
Choice [o/s/a/D]:
|
||||
```
|
||||
|
||||
The four options:
|
||||
|
||||
- **once** — allow this single execution
|
||||
- **session** — allow this pattern for the rest of the session
|
||||
- **always** — add to permanent allowlist (saved to `config.yaml`)
|
||||
- **deny** (default) — block the command
|
||||
|
||||
### Approval Flow (Gateway/Messaging)
|
||||
|
||||
On messaging platforms, the agent sends the dangerous command details to the chat and waits for the user to reply:
|
||||
|
||||
- Reply **yes**, **y**, **approve**, **ok**, or **go** to approve
|
||||
- Reply **no**, **n**, **deny**, or **cancel** to deny
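
Classifying a reply against the two word lists above can be sketched as follows. Trimming whitespace and ignoring case are assumptions made for this example, as is the function name `classify_reply`.

```python
APPROVE_WORDS = {"yes", "y", "approve", "ok", "go"}
DENY_WORDS = {"no", "n", "deny", "cancel"}


def classify_reply(text):
    """Map a chat reply to 'approve', 'deny', or None if unrecognized."""
    word = text.strip().lower()
    if word in APPROVE_WORDS:
        return "approve"
    if word in DENY_WORDS:
        return "deny"
    return None  # not an approval answer; the command stays pending
```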
|
||||
|
||||
The `HERMES_EXEC_ASK=1` environment variable is automatically set when running the gateway.
|
||||
|
||||
### Permanent Allowlist
|
||||
|
||||
Commands approved with "always" are saved to `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
# Permanently allowed dangerous command patterns
|
||||
command_allowlist:
|
||||
- rm
|
||||
- systemctl
|
||||
```
|
||||
|
||||
These patterns are loaded at startup and silently approved in all future sessions.
|
||||
|
||||
:::tip
|
||||
Use `hermes config edit` to review or remove patterns from your permanent allowlist.
|
||||
:::
|
||||
|
||||
## User Authorization (Gateway)
|
||||
|
||||
When running the messaging gateway, Hermes controls who can interact with the bot through a layered authorization system.
|
||||
|
||||
### Authorization Check Order
|
||||
|
||||
The `_is_user_authorized()` method checks in this order:
|
||||
|
||||
1. **Per-platform allow-all flag** (e.g., `DISCORD_ALLOW_ALL_USERS=true`)
|
||||
2. **DM pairing approved list** (users approved via pairing codes)
|
||||
3. **Platform-specific allowlists** (e.g., `TELEGRAM_ALLOWED_USERS=12345,67890`)
|
||||
4. **Global allowlist** (`GATEWAY_ALLOWED_USERS=12345,67890`)
|
||||
5. **Global allow-all** (`GATEWAY_ALLOW_ALL_USERS=true`)
|
||||
6. **Default: deny**
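
The check order above can be illustrated with a short function. This is a sketch under assumptions, not the real `_is_user_authorized()` method: the signature, the way environment values are parsed, and the `paired_users` set are all hypothetical.

```python
def is_user_authorized(user_id, platform, env, paired_users):
    """Apply the documented check order; anything not matched is denied.

    `env` is a mapping of environment variable names to string values and
    `paired_users` is the set of user IDs approved via DM pairing.
    """
    prefix = platform.upper()

    def flag(name):
        return env.get(name, "").strip().lower() == "true"

    def listed(name):
        entries = {item.strip() for item in env.get(name, "").split(",")}
        return user_id in entries - {""}

    if flag(f"{prefix}_ALLOW_ALL_USERS"):  # 1. per-platform allow-all
        return True
    if user_id in paired_users:  # 2. DM pairing approved list
        return True
    if listed(f"{prefix}_ALLOWED_USERS"):  # 3. platform allowlist
        return True
    if listed("GATEWAY_ALLOWED_USERS"):  # 4. global allowlist
        return True
    if flag("GATEWAY_ALLOW_ALL_USERS"):  # 5. global allow-all
        return True
    return False  # 6. default: deny
```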
|
||||
|
||||
### Platform Allowlists
|
||||
|
||||
Set allowed user IDs as comma-separated values in `~/.hermes/.env`:
|
||||
|
||||
```bash
|
||||
# Platform-specific allowlists
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321
|
||||
DISCORD_ALLOWED_USERS=111222333444555666
|
||||
WHATSAPP_ALLOWED_USERS=15551234567
|
||||
SLACK_ALLOWED_USERS=U01ABC123
|
||||
|
||||
# Cross-platform allowlist (checked for all platforms)
|
||||
GATEWAY_ALLOWED_USERS=123456789
|
||||
|
||||
# Per-platform allow-all (use with caution)
|
||||
DISCORD_ALLOW_ALL_USERS=true
|
||||
|
||||
# Global allow-all (use with extreme caution)
|
||||
GATEWAY_ALLOW_ALL_USERS=true
|
||||
```
|
||||
|
||||
:::warning
|
||||
If **no allowlists are configured** and `GATEWAY_ALLOW_ALL_USERS` is not set, **all users are denied**. The gateway logs a warning at startup:
|
||||
|
||||
```
|
||||
No user allowlists configured. All unauthorized users will be denied.
|
||||
Set GATEWAY_ALLOW_ALL_USERS=true in ~/.hermes/.env to allow open access,
|
||||
or configure platform allowlists (e.g., TELEGRAM_ALLOWED_USERS=your_id).
|
||||
```
|
||||
:::
|
||||
|
||||
### DM Pairing System
|
||||
|
||||
For more flexible authorization, Hermes includes a code-based pairing system. Instead of requiring user IDs upfront, unknown users receive a one-time pairing code that the bot owner approves via the CLI.
|
||||
|
||||
**How it works:**
|
||||
|
||||
1. An unknown user sends a DM to the bot
|
||||
2. The bot replies with an 8-character pairing code
|
||||
3. The bot owner runs `hermes pairing approve <platform> <code>` on the CLI
|
||||
4. The user is permanently approved for that platform
|
||||
|
||||
Control how unauthorized direct messages are handled in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
unauthorized_dm_behavior: pair

whatsapp:
  unauthorized_dm_behavior: ignore
```
|
||||
|
||||
- `pair` is the default. Unauthorized DMs get a pairing code reply.
|
||||
- `ignore` silently drops unauthorized DMs.
|
||||
- Platform sections override the global default, so you can keep pairing on Telegram while keeping WhatsApp silent.
|
||||
|
||||
**Security features** (based on OWASP + NIST SP 800-63-4 guidance):
|
||||
|
||||
| Feature | Details |
|---------|---------|
| Code format | 8-char from 32-char unambiguous alphabet (no 0/O/1/I) |
| Randomness | Cryptographic (`secrets.choice()`) |
| Code TTL | 1 hour expiry |
| Rate limiting | 1 request per user per 10 minutes |
| Pending limit | Max 3 pending codes per platform |
| Lockout | 5 failed approval attempts → 1-hour lockout |
| File security | `chmod 0600` on all pairing data files |
| Logging | Codes are never logged to stdout |
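
The code format, randomness, and TTL rows can be illustrated with a short sketch. The exact alphabet below is an assumption (A-Z without `I` and `O`, plus digits 2-9, which gives 32 symbols), and the helper names are hypothetical; only `secrets.choice()`, the 8-character length, and the 1-hour expiry come from the table.

```python
import secrets
import time

# Assumed alphabet: 24 letters (no I, no O) + 8 digits (no 0, no 1) = 32 symbols.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
CODE_LENGTH = 8
CODE_TTL_SECONDS = 60 * 60  # 1 hour


def new_pairing_code():
    """Draw each character independently from a cryptographic RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))


def is_expired(issued_at, now=None):
    """Return True once the code is at least one hour old."""
    now = time.time() if now is None else now
    return now - issued_at >= CODE_TTL_SECONDS
```

With 32 symbols and 8 positions there are 32^8 = 2^40 possible codes, which is why the rate limit and lockout rows matter more than the code length for resisting guessing.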
|
||||
|
||||
**Pairing CLI commands:**
|
||||
|
||||
```bash
|
||||
# List pending and approved users
|
||||
hermes pairing list
|
||||
|
||||
# Approve a pairing code
|
||||
hermes pairing approve telegram ABC12DEF
|
||||
|
||||
# Revoke a user's access
|
||||
hermes pairing revoke telegram 123456789
|
||||
|
||||
# Clear all pending codes
|
||||
hermes pairing clear-pending
|
||||
```
|
||||
|
||||
**Storage:** Pairing data is stored in `~/.hermes/pairing/` with per-platform JSON files:
|
||||
- `{platform}-pending.json` — pending pairing requests
|
||||
- `{platform}-approved.json` — approved users
|
||||
- `_rate_limits.json` — rate limit and lockout tracking
|
||||
|
||||
## Container Isolation
|
||||
|
||||
When using the `docker` terminal backend, Hermes applies strict security hardening to every container.
|
||||
|
||||
### Docker Security Flags
|
||||
|
||||
Every container runs with these flags (defined in `tools/environments/docker.py`):
|
||||
|
||||
```python
_SECURITY_ARGS = [
    "--cap-drop", "ALL",                               # Drop ALL Linux capabilities
    "--security-opt", "no-new-privileges",             # Block privilege escalation
    "--pids-limit", "256",                             # Limit process count
    "--tmpfs", "/tmp:rw,nosuid,size=512m",             # Size-limited /tmp
    "--tmpfs", "/var/tmp:rw,noexec,nosuid,size=256m",  # No-exec /var/tmp
    "--tmpfs", "/run:rw,noexec,nosuid,size=64m",       # No-exec /run
]
```
|
||||
|
||||
### Resource Limits
|
||||
|
||||
Container resources are configurable in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
terminal:
  backend: docker
  docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  docker_forward_env: []      # Explicit allowlist only; empty keeps secrets out of the container
  container_cpu: 1            # CPU cores
  container_memory: 5120      # MB (default 5GB)
  container_disk: 51200       # MB (default 50GB, requires overlay2 on XFS)
  container_persistent: true  # Persist filesystem across sessions
```
|
||||
|
||||
### Filesystem Persistence
|
||||
|
||||
- **Persistent mode** (`container_persistent: true`): Bind-mounts `/workspace` and `/root` from `~/.hermes/sandboxes/docker/<task_id>/`
|
||||
- **Ephemeral mode** (`container_persistent: false`): Uses tmpfs for workspace — everything is lost on cleanup
|
||||
|
||||
:::tip
|
||||
For production gateway deployments, use the `docker`, `modal`, or `daytona` backend to isolate agent commands from your host system. With these backends the container or sandbox is the security boundary, so the dangerous command check is skipped and no approval prompts are needed.
|
||||
:::
|
||||
|
||||
:::warning
|
||||
If you add names to `terminal.docker_forward_env`, those variables are intentionally injected into the container for terminal commands. This is useful for task-specific credentials like `GITHUB_TOKEN`, but it also means code running in the container can read and exfiltrate them.
|
||||
:::
|
||||
|
||||
## Terminal Backend Security Comparison
|
||||
|
||||
| Backend | Isolation | Dangerous Cmd Check | Best For |
|---------|-----------|-------------------|----------|
| **local** | None — runs on host | ✅ Yes | Development, trusted users |
| **ssh** | Remote machine | ✅ Yes | Running on a separate server |
| **docker** | Container | ❌ Skipped (container is boundary) | Production gateway |
| **singularity** | Container | ❌ Skipped | HPC environments |
| **modal** | Cloud sandbox | ❌ Skipped | Scalable cloud isolation |
| **daytona** | Cloud sandbox | ❌ Skipped | Persistent cloud workspaces |
|
||||
|
||||
## Environment Variable Passthrough {#environment-variable-passthrough}
|
||||
|
||||
Both `execute_code` and `terminal` strip sensitive environment variables from child processes to prevent credential exfiltration by LLM-generated code. However, skills that declare `required_environment_variables` legitimately need access to those vars.
|
||||
|
||||
### How It Works
|
||||
|
||||
Two mechanisms allow specific variables through the sandbox filters:
|
||||
|
||||
**1. Skill-scoped passthrough (automatic)**
|
||||
|
||||
When a skill is loaded (via `skill_view` or the `/skill` command) and declares `required_environment_variables`, any of those vars that are actually set in the environment are automatically registered as passthrough. Missing vars (still in setup-needed state) are **not** registered.
|
||||
|
||||
```yaml
# In a skill's SKILL.md frontmatter
required_environment_variables:
  - name: TENOR_API_KEY
    prompt: Tenor API key
    help: Get a key from https://developers.google.com/tenor
```
|
||||
|
||||
After loading this skill, `TENOR_API_KEY` passes through to both `execute_code` and `terminal` subprocesses — no manual configuration needed.
|
||||
|
||||
**2. Config-based passthrough (manual)**
|
||||
|
||||
For env vars not declared by any skill, add them to `terminal.env_passthrough` in `config.yaml`:
|
||||
|
||||
```yaml
terminal:
  env_passthrough:
    - MY_CUSTOM_KEY
    - ANOTHER_TOKEN
```
|
||||
|
||||
### What Each Sandbox Filters
|
||||
|
||||
| Sandbox | Default Filter | Passthrough Override |
|---------|---------------|---------------------|
| **execute_code** | Blocks vars containing `KEY`, `TOKEN`, `SECRET`, `PASSWORD`, `CREDENTIAL`, `PASSWD`, `AUTH` in name; only allows safe-prefix vars through | ✅ Passthrough vars bypass both checks |
| **terminal** (local) | Blocks explicit Hermes infrastructure vars (provider keys, gateway tokens, tool API keys) | ✅ Passthrough vars bypass the blocklist |
| **MCP** | Blocks everything except safe system vars + explicitly configured `env` | ❌ Not affected by passthrough (use MCP `env` config instead) |
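
The `execute_code` row describes two checks plus a passthrough bypass. The sketch below shows one way those rules could combine. The blocked substrings come from the table; the `SAFE_PREFIXES` tuple and the `filter_env` name are hypothetical, since the actual safe-prefix list is not given here.

```python
BLOCKED_SUBSTRINGS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL", "PASSWD", "AUTH")
# Hypothetical safe prefixes, for illustration only.
SAFE_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM", "SHELL", "TMPDIR")


def filter_env(env, passthrough=frozenset()):
    """Return the subset of env that a sandboxed child process would see."""
    kept = {}
    for name, value in env.items():
        if name in passthrough:
            kept[name] = value  # passthrough vars bypass both checks
            continue
        upper = name.upper()
        if any(part in upper for part in BLOCKED_SUBSTRINGS):
            continue            # check 1: name looks like a credential
        if upper.startswith(SAFE_PREFIXES):
            kept[name] = value  # check 2: only safe-prefix vars get through
    return kept
```

For example, `filter_env(os.environ, {"TENOR_API_KEY"})` would keep `TENOR_API_KEY` even though its name contains `KEY`, while other `*_KEY` variables stay hidden.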
|
||||
|
||||
### Security Considerations
|
||||
|
||||
- The passthrough only affects vars you or your skills explicitly declare — the default security posture is unchanged for arbitrary LLM-generated code
|
||||
- Skills Guard scans skill content for suspicious env access patterns before installation
|
||||
- Missing/unset vars are never registered (you can't leak what doesn't exist)
|
||||
- Hermes infrastructure secrets (provider API keys, gateway tokens) should never be added to `env_passthrough` — they have dedicated mechanisms
|
||||
|
||||
## MCP Credential Handling
|
||||
|
||||
MCP (Model Context Protocol) server subprocesses receive a **filtered environment** to prevent accidental credential leakage.
|
||||
|
||||
### Safe Environment Variables
|
||||
|
||||
Only these variables are passed through from the host to MCP stdio subprocesses:
|
||||
|
||||
```
|
||||
PATH, HOME, USER, LANG, LC_ALL, TERM, SHELL, TMPDIR
|
||||
```
|
||||
|
||||
Plus any `XDG_*` variables. All other environment variables (API keys, tokens, secrets) are **stripped**.
|
||||
|
||||
Variables explicitly defined in the MCP server's `env` config are passed through:
|
||||
|
||||
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."  # Only this is passed
```
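
Put together, the environment for an MCP stdio subprocess is the safe system variables, any `XDG_*` variables, and the server's explicit `env` entries. A minimal sketch, assuming a hypothetical `build_mcp_env` helper (the real function name and merge order are not documented here):

```python
SAFE_VARS = {"PATH", "HOME", "USER", "LANG", "LC_ALL", "TERM", "SHELL", "TMPDIR"}


def build_mcp_env(host_env, server_env=None):
    """Keep safe host vars and XDG_* vars, then add the server's own env entries."""
    env = {
        name: value
        for name, value in host_env.items()
        if name in SAFE_VARS or name.startswith("XDG_")
    }
    env.update(server_env or {})  # explicitly configured vars are always passed
    return env
```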
|
||||
|
||||
### Credential Redaction
|
||||
|
||||
Error messages from MCP tools are sanitized before being returned to the LLM. The following patterns are replaced with `[REDACTED]`:
|
||||
|
||||
- GitHub PATs (`ghp_...`)
|
||||
- OpenAI-style keys (`sk-...`)
|
||||
- Bearer tokens
|
||||
- `token=`, `key=`, `API_KEY=`, `password=`, `secret=` parameters
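
A redaction pass over these patterns might look like the sketch below. The regular expressions are illustrative assumptions, not the patterns Hermes actually ships, and keeping the parameter name while replacing only its value is also an assumption.

```python
import re


def redact(message):
    """Replace credential-looking substrings with [REDACTED]."""
    # GitHub personal access tokens
    message = re.sub(r"ghp_[A-Za-z0-9]+", "[REDACTED]", message)
    # OpenAI-style keys
    message = re.sub(r"\bsk-[A-Za-z0-9_-]+", "[REDACTED]", message)
    # Bearer tokens: keep the scheme, drop the credential
    message = re.sub(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+", r"\1[REDACTED]", message)
    # key=value style parameters: keep the name, drop the value
    message = re.sub(
        r"(?i)\b((?:api_key|token|key|password|secret)=)[^\s&]+",
        r"\1[REDACTED]",
        message,
    )
    return message
```

Running `redact()` on every MCP error string before it reaches the model keeps a leaked token in a stack trace from ending up in the conversation history.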
|
||||
|
||||
### Website Access Policy
|
||||
|
||||
You can restrict which websites the agent can access through its web and browser tools. This is useful for preventing the agent from accessing internal services, admin panels, or other sensitive URLs.
|
||||
|
||||
```yaml
# In ~/.hermes/config.yaml
security:
  website_blocklist:
    enabled: true
    domains:
      - "*.internal.company.com"
      - "admin.example.com"
    shared_files:
      - "/etc/hermes/blocked-sites.txt"
```
|
||||
|
||||
When a blocked URL is requested, the tool returns an error explaining that the domain is blocked by policy. The blocklist is enforced across `web_search`, `web_extract`, `browser_navigate`, and all other URL-capable tools.
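
To show how a wildcard entry such as `*.internal.company.com` could be evaluated, here is a small sketch using shell-style matching on the URL's hostname. The `is_blocked` helper and the choice of `fnmatch` semantics are assumptions; the actual matching rules may differ (for example, in whether the bare parent domain also matches).

```python
from fnmatch import fnmatch
from urllib.parse import urlsplit


def is_blocked(url, patterns):
    """Return True if the URL's hostname matches any blocklist pattern."""
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatch(host, pattern.lower()) for pattern in patterns)
```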
|
||||
|
||||
See [Website Blocklist](/docs/user-guide/configuration#website-blocklist) in the configuration guide for full details.
|
||||
|
||||
### SSRF Protection
|
||||
|
||||
All URL-capable tools (web search, web extract, vision, browser) validate URLs before fetching them to prevent Server-Side Request Forgery (SSRF) attacks. Blocked addresses include:
|
||||
|
||||
- **Private networks** (RFC 1918): `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`
|
||||
- **Loopback**: `127.0.0.0/8`, `::1`
|
||||
- **Link-local**: `169.254.0.0/16` (includes cloud metadata at `169.254.169.254`)
|
||||
- **CGNAT / shared address space** (RFC 6598): `100.64.0.0/10` (Tailscale, WireGuard VPNs)
|
||||
- **Cloud metadata hostnames**: `metadata.google.internal`, `metadata.goog`
|
||||
- **Reserved, multicast, and unspecified addresses**
|
||||
|
||||
SSRF protection is always active and cannot be disabled. DNS failures are treated as blocked (fail-closed). Redirect chains are re-validated at each hop to prevent redirect-based bypasses.
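
The address classes listed above map closely onto Python's `ipaddress` module. The following sketch is an approximation under stated assumptions: the helper names are hypothetical, DNS resolution is left to the caller, and redirect re-validation is not shown. Note that the CGNAT range needs an explicit check because `ipaddress` does not classify it as private.

```python
import ipaddress

CGNAT = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 shared address space
BLOCKED_HOSTNAMES = {"metadata.google.internal", "metadata.goog"}


def is_blocked_ip(text):
    """Return True for private, loopback, link-local, CGNAT, and reserved addresses."""
    ip = ipaddress.ip_address(text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
        or (ip.version == 4 and ip in CGNAT)
    )


def is_blocked_host(hostname, resolved_ips):
    """Fail closed: block metadata hostnames, DNS failures, and any blocked address."""
    if hostname.lower().rstrip(".") in BLOCKED_HOSTNAMES:
        return True
    if not resolved_ips:  # empty result stands in for a DNS failure
        return True
    return any(is_blocked_ip(ip) for ip in resolved_ips)
```

A caller would run the same check again on the target of every redirect before following it.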
|
||||
|
||||
### Tirith Pre-Exec Security Scanning
|
||||
|
||||
Hermes integrates [tirith](https://github.com/sheeki03/tirith) for content-level command scanning before execution. Tirith detects threats that pattern matching alone misses:
|
||||
|
||||
- Homograph URL spoofing (internationalized domain attacks)
|
||||
- Pipe-to-interpreter patterns (`curl | bash`, `wget | sh`)
|
||||
- Terminal injection attacks
|
||||
|
||||
Tirith auto-installs from GitHub releases on first use with SHA-256 checksum verification (and cosign provenance verification if cosign is available).
|
||||
|
||||
```yaml
# In ~/.hermes/config.yaml
security:
  tirith_enabled: true    # Enable/disable tirith scanning (default: true)
  tirith_path: "tirith"   # Path to tirith binary (default: PATH lookup)
  tirith_timeout: 5       # Subprocess timeout in seconds
  tirith_fail_open: true  # Allow execution when tirith is unavailable (default: true)
```
|
||||
|
||||
When `tirith_fail_open` is `true` (default), commands proceed if tirith is not installed or times out. Set to `false` in high-security environments to block commands when tirith is unavailable.
|
||||
|
||||
Tirith's verdict integrates with the approval flow: safe commands pass through, suspicious commands trigger user approval, and dangerous commands are blocked.
|
||||
|
||||
### Context File Injection Protection
|
||||
|
||||
Context files (AGENTS.md, .cursorrules, SOUL.md) are scanned for prompt injection before being included in the system prompt. The scanner checks for:
|
||||
|
||||
- Instructions to ignore/disregard prior instructions
|
||||
- Hidden HTML comments with suspicious keywords
|
||||
- Attempts to read secrets (`.env`, `credentials`, `.netrc`)
|
||||
- Credential exfiltration via `curl`
|
||||
- Invisible Unicode characters (zero-width spaces, bidirectional overrides)
|
||||
|
||||
Blocked files show a warning:
|
||||
|
||||
```
|
||||
[BLOCKED: AGENTS.md contained potential prompt injection (prompt_injection). Content not loaded.]
|
||||
```
|
||||
|
||||
## Best Practices for Production Deployment
|
||||
|
||||
### Gateway Deployment Checklist
|
||||
|
||||
1. **Set explicit allowlists** — never use `GATEWAY_ALLOW_ALL_USERS=true` in production
|
||||
2. **Use container backend** — set `terminal.backend: docker` in config.yaml
|
||||
3. **Restrict resource limits** — set appropriate CPU, memory, and disk limits
|
||||
4. **Store secrets securely** — keep API keys in `~/.hermes/.env` with proper file permissions
|
||||
5. **Enable DM pairing** — use pairing codes instead of hardcoding user IDs when possible
|
||||
6. **Review command allowlist** — periodically audit `command_allowlist` in config.yaml
|
||||
7. **Set `MESSAGING_CWD`** — don't let the agent operate from sensitive directories
|
||||
8. **Run as non-root** — never run the gateway as root
|
||||
9. **Monitor logs** — check `~/.hermes/logs/` for unauthorized access attempts
|
||||
10. **Keep updated** — run `hermes update` regularly for security patches
|
||||
|
||||
### Securing API Keys
|
||||
|
||||
```bash
|
||||
# Set proper permissions on the .env file
|
||||
chmod 600 ~/.hermes/.env
|
||||
|
||||
# Keep separate keys for different services
|
||||
# Never commit .env files to version control
|
||||
```
|
||||
|
||||
### Network Isolation
|
||||
|
||||
For maximum security, run the gateway on a separate machine or VM:
|
||||
|
||||
```yaml
terminal:
  backend: ssh
  ssh_host: "agent-worker.local"
  ssh_user: "hermes"
  ssh_key: "~/.ssh/hermes_agent_key"
```
|
||||
|
||||
This keeps the gateway's messaging connections separate from the agent's command execution.
|
||||
390
hermes_code/website/docs/user-guide/sessions.md
Normal file
|
|
@ -0,0 +1,390 @@
|
|||
---
|
||||
sidebar_position: 7
|
||||
title: "Sessions"
|
||||
description: "Session persistence, resume, search, management, and per-platform session tracking"
|
||||
---
|
||||
|
||||
# Sessions
|
||||
|
||||
Hermes Agent automatically saves every conversation as a session. Sessions enable conversation resume, cross-session search, and full conversation history management.
|
||||
|
||||
## How Sessions Work
|
||||
|
||||
Every conversation — whether from the CLI, Telegram, Discord, WhatsApp, or Slack — is stored as a session with full message history. Sessions are tracked in two complementary systems:
|
||||
|
||||
1. **SQLite database** (`~/.hermes/state.db`) — structured session metadata with FTS5 full-text search
|
||||
2. **JSONL transcripts** (`~/.hermes/sessions/`) — raw conversation transcripts including tool calls (gateway)
|
||||
|
||||
The SQLite database stores:
|
||||
- Session ID, source platform, user ID
|
||||
- **Session title** (unique, human-readable name)
|
||||
- Model name and configuration
|
||||
- System prompt snapshot
|
||||
- Full message history (role, content, tool calls, tool results)
|
||||
- Token counts (input/output)
|
||||
- Timestamps (started_at, ended_at)
|
||||
- Parent session ID (for compression-triggered session splitting)
|
||||
|
||||
### Session Sources
|
||||
|
||||
Each session is tagged with its source platform:
|
||||
|
||||
| Source | Description |
|--------|-------------|
| `cli` | Interactive CLI (`hermes` or `hermes chat`) |
| `telegram` | Telegram messenger |
| `discord` | Discord server/DM |
| `whatsapp` | WhatsApp messenger |
| `slack` | Slack workspace |
|
||||
|
||||
## CLI Session Resume
|
||||
|
||||
Resume previous conversations from the CLI using `--continue` or `--resume`:
|
||||
|
||||
### Continue Last Session
|
||||
|
||||
```bash
|
||||
# Resume the most recent CLI session
|
||||
hermes --continue
|
||||
hermes -c
|
||||
|
||||
# Or with the chat subcommand
|
||||
hermes chat --continue
|
||||
hermes chat -c
|
||||
```
|
||||
|
||||
This looks up the most recent `cli` session from the SQLite database and loads its full conversation history.
|
||||
|
||||
### Resume by Name
|
||||
|
||||
If you've given a session a title (see [Session Naming](#session-naming) below), you can resume it by name:
|
||||
|
||||
```bash
|
||||
# Resume a named session
|
||||
hermes -c "my project"
|
||||
|
||||
# If there are lineage variants (my project, my project #2, my project #3),
|
||||
# this automatically resumes the most recent one
|
||||
hermes -c "my project" # → resumes "my project #3"
|
||||
```
|
||||
|
||||
### Resume Specific Session
|
||||
|
||||
```bash
|
||||
# Resume a specific session by ID
|
||||
hermes --resume 20250305_091523_a1b2c3d4
|
||||
hermes -r 20250305_091523_a1b2c3d4
|
||||
|
||||
# Resume by title
|
||||
hermes --resume "refactoring auth"
|
||||
|
||||
# Or with the chat subcommand
|
||||
hermes chat --resume 20250305_091523_a1b2c3d4
|
||||
```
|
||||
|
||||
Session IDs are shown when you exit a CLI session, and can be found with `hermes sessions list`.
|
||||
|
||||
### Conversation Recap on Resume
|
||||
|
||||
When you resume a session, Hermes displays a compact recap of the previous conversation in a styled panel before the input prompt:
|
||||
|
||||
<img className="docs-terminal-figure" src="/img/docs/session-recap.svg" alt="Stylized preview of the Previous Conversation recap panel shown when resuming a Hermes session." />
|
||||
<p className="docs-figure-caption">Resume mode shows a compact recap panel with recent user and assistant turns before returning you to the live prompt.</p>
|
||||
|
||||
The recap:
|
||||
- Shows **user messages** (gold `●`) and **assistant responses** (green `◆`)
|
||||
- **Truncates** long messages (300 chars for user, 200 chars / 3 lines for assistant)
|
||||
- **Collapses tool calls** to a count with tool names (e.g., `[3 tool calls: terminal, web_search]`)
|
||||
- **Hides** system messages, tool results, and internal reasoning
|
||||
- **Caps** at the last 10 exchanges with a "... N earlier messages ..." indicator
|
||||
- Uses **dim styling** to distinguish from the active conversation
|
||||
|
||||
To disable the recap and keep the minimal one-liner behavior, set in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
display:
  resume_display: minimal  # default: full
```
|
||||
|
||||
:::tip
|
||||
Session IDs follow the format `YYYYMMDD_HHMMSS_<8-char-hex>`, e.g. `20250305_091523_a1b2c3d4`. You can resume by ID or by title — both work with `-c` and `-r`.
|
||||
:::
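
The ID format is easy to generate and validate, which is useful if you script around `hermes sessions`. This sketch assumes local time and a random hex suffix; the function names are hypothetical and the real generator may differ.

```python
import re
import secrets
from datetime import datetime

SESSION_ID_RE = re.compile(r"\d{8}_\d{6}_[0-9a-f]{8}")


def new_session_id(now=None):
    """Build an ID shaped like YYYYMMDD_HHMMSS_<8-char-hex>."""
    now = now or datetime.now()  # assumption: local time
    return f"{now:%Y%m%d_%H%M%S}_{secrets.token_hex(4)}"


def looks_like_session_id(value):
    """Distinguish a session ID from a free-form title."""
    return SESSION_ID_RE.fullmatch(value) is not None
```

A check like `looks_like_session_id()` is one way a resume command could decide whether its argument should be treated as an ID or as a title.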
|
||||
|
||||
## Session Naming
|
||||
|
||||
Give sessions human-readable titles so you can find and resume them easily.
|
||||
|
||||
### Auto-Generated Titles
|
||||
|
||||
Hermes automatically generates a short descriptive title (3–7 words) for each session after the first exchange. This runs in a background thread using a fast auxiliary model, so it adds no latency. You'll see auto-generated titles when browsing sessions with `hermes sessions list` or `hermes sessions browse`.
|
||||
|
||||
Auto-titling only fires once per session and is skipped if you've already set a title manually.
|
||||
|
||||
### Setting a Title Manually
|
||||
|
||||
Use the `/title` slash command inside any chat session (CLI or gateway):
|
||||
|
||||
```
|
||||
/title my research project
|
||||
```
|
||||
|
||||
The title is applied immediately. If the session hasn't been created in the database yet (e.g., you run `/title` before sending your first message), it's queued and applied once the session starts.
|
||||
|
||||
You can also rename existing sessions from the command line:
|
||||
|
||||
```bash
|
||||
hermes sessions rename 20250305_091523_a1b2c3d4 "refactoring auth module"
|
||||
```
|
||||
|
||||
### Title Rules
|
||||
|
||||
- **Unique** — no two sessions can share the same title
|
||||
- **Max 100 characters** — keeps listing output clean
|
||||
- **Sanitized** — control characters, zero-width chars, and RTL overrides are stripped automatically
|
||||
- **Normal Unicode is fine** — emoji, CJK, accented characters all work
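
The rules above can be approximated in a few lines. This is a sketch under assumptions: the exact set of stripped characters is a guess, the `sanitize_title` name is hypothetical, and rejecting over-long titles with an error (rather than truncating them) is an assumption. Uniqueness is not shown because it requires a database lookup.

```python
import unicodedata

MAX_TITLE_LENGTH = 100
# Assumed set: zero-width characters plus bidirectional marks and overrides.
_INVISIBLE = set(
    "\u200b\u200c\u200d\u2060\ufeff"    # zero-width space/joiners, word joiner, BOM
    "\u200e\u200f"                      # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"    # embeddings and overrides
    "\u2066\u2067\u2068\u2069"          # isolates
)


def sanitize_title(raw):
    """Strip control and invisible characters, then enforce the length limit."""
    cleaned = "".join(
        ch for ch in raw
        if ch not in _INVISIBLE and unicodedata.category(ch) != "Cc"
    ).strip()
    if len(cleaned) > MAX_TITLE_LENGTH:
        raise ValueError("title exceeds 100 characters")
    return cleaned
```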
|
||||
|
||||
### Auto-Lineage on Compression
|
||||
|
||||
When a session's context is compressed (manually via `/compress` or automatically), Hermes creates a new continuation session. If the original had a title, the new session automatically gets a numbered title:
|
||||
|
||||
```
|
||||
"my project" → "my project #2" → "my project #3"
|
||||
```
|
||||
|
||||
When you resume by name (`hermes -c "my project"`), it automatically picks the most recent session in the lineage.
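
The numbering scheme can be sketched as follows. Both helpers are hypothetical, and picking the highest number as "most recent" is an assumption; the real lookup may order by timestamp instead.

```python
import re

_SUFFIX = re.compile(r"^(?P<base>.*?) #(?P<n>\d+)$")


def next_lineage_title(title):
    """'my project' -> 'my project #2' -> 'my project #3'."""
    match = _SUFFIX.match(title)
    if match:
        return f"{match['base']} #{int(match['n']) + 1}"
    return f"{title} #2"


def latest_in_lineage(base, titles):
    """Return the highest-numbered title in the lineage of base, or None."""
    numbered = []
    for title in titles:
        if title == base:
            numbered.append((1, title))
            continue
        match = _SUFFIX.match(title)
        if match and match["base"] == base:
            numbered.append((int(match["n"]), title))
    return max(numbered)[1] if numbered else None
```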
|
||||
|
||||
### /title in Messaging Platforms
|
||||
|
||||
The `/title` command works in all gateway platforms (Telegram, Discord, Slack, WhatsApp):
|
||||
|
||||
- `/title My Research` — set the session title
|
||||
- `/title` — show the current title
|
||||
|
||||
## Session Management Commands
|
||||
|
||||
Hermes provides a full set of session management commands via `hermes sessions`:
|
||||
|
||||
### List Sessions
|
||||
|
||||
```bash
|
||||
# List recent sessions (default: last 20)
|
||||
hermes sessions list
|
||||
|
||||
# Filter by platform
|
||||
hermes sessions list --source telegram
|
||||
|
||||
# Show more sessions
|
||||
hermes sessions list --limit 50
|
||||
```
|
||||
|
||||
When sessions have titles, the output shows titles, previews, and relative timestamps:
|
||||
|
||||
```
|
||||
Title Preview Last Active ID
|
||||
────────────────────────────────────────────────────────────────────────────────────────────────
|
||||
refactoring auth Help me refactor the auth module please 2h ago 20250305_091523_a
|
||||
my project #3 Can you check the test failures? yesterday 20250304_143022_e
|
||||
— What's the weather in Las Vegas? 3d ago 20250303_101500_f
|
||||
```
|
||||
|
||||
When no sessions have titles, a simpler format is used:
|
||||
|
||||
```
|
||||
Preview Last Active Src ID
|
||||
──────────────────────────────────────────────────────────────────────────────────────
|
||||
Help me refactor the auth module please 2h ago cli 20250305_091523_a
|
||||
What's the weather in Las Vegas? 3d ago tele 20250303_101500_f
|
||||
```
|
||||
|
||||
### Export Sessions
|
||||
|
||||
```bash
|
||||
# Export all sessions to a JSONL file
|
||||
hermes sessions export backup.jsonl
|
||||
|
||||
# Export sessions from a specific platform
|
||||
hermes sessions export telegram-history.jsonl --source telegram
|
||||
|
||||
# Export a single session
|
||||
hermes sessions export session.jsonl --session-id 20250305_091523_a1b2c3d4
|
||||
```
|
||||
|
||||
Exported files contain one JSON object per line with full session metadata and all messages.
|
||||
|
||||
### Delete a Session
|
||||
|
||||
```bash
|
||||
# Delete a specific session (with confirmation)
|
||||
hermes sessions delete 20250305_091523_a1b2c3d4
|
||||
|
||||
# Delete without confirmation
|
||||
hermes sessions delete 20250305_091523_a1b2c3d4 --yes
|
||||
```
|
||||
|
||||
### Rename a Session
|
||||
|
||||
```bash
|
||||
# Set or change a session's title
|
||||
hermes sessions rename 20250305_091523_a1b2c3d4 "debugging auth flow"
|
||||
|
||||
# Multi-word titles don't need quotes in the CLI
|
||||
hermes sessions rename 20250305_091523_a1b2c3d4 debugging auth flow
|
||||
```
|
||||
|
||||
If the title is already in use by another session, an error is shown.
|
||||
|
||||
### Prune Old Sessions
|
||||
|
||||
```bash
|
||||
# Delete ended sessions older than 90 days (default)
|
||||
hermes sessions prune
|
||||
|
||||
# Custom age threshold
|
||||
hermes sessions prune --older-than 30
|
||||
|
||||
# Only prune sessions from a specific platform
|
||||
hermes sessions prune --source telegram --older-than 60
|
||||
|
||||
# Skip confirmation
|
||||
hermes sessions prune --older-than 30 --yes
|
||||
```
|
||||
|
||||
:::info
|
||||
Pruning only deletes **ended** sessions (sessions that have been explicitly ended or auto-reset). Active sessions are never pruned.
|
||||
:::
|
||||
|
||||
### Session Statistics
|
||||
|
||||
```bash
|
||||
hermes sessions stats
|
||||
```
|
||||
|
||||
Output:
|
||||
|
||||
```
|
||||
Total sessions: 142
|
||||
Total messages: 3847
|
||||
cli: 89 sessions
|
||||
telegram: 38 sessions
|
||||
discord: 15 sessions
|
||||
Database size: 12.4 MB
|
||||
```
|
||||
|
||||
For deeper analytics — token usage, cost estimates, tool breakdown, and activity patterns — use [`hermes insights`](/docs/reference/cli-commands#hermes-insights).
|
||||
|
||||
## Session Search Tool
|
||||
|
||||
The agent has a built-in `session_search` tool that performs full-text search across all past conversations using SQLite's FTS5 engine.
|
||||
|
||||
### How It Works
|
||||
|
||||
1. Runs an FTS5 query to find matching messages, ranked by relevance
|
||||
2. Groups results by session, takes the top N unique sessions (default 3)
|
||||
3. Loads each session's conversation, truncates to ~100K chars centered on matches
|
||||
4. Sends to a fast summarization model for focused summaries
|
||||
5. Returns per-session summaries with metadata and surrounding context
|
||||
|
||||
### FTS5 Query Syntax
|
||||
|
||||
The search supports standard FTS5 query syntax:
|
||||
|
||||
- Simple keywords: `docker deployment`
|
||||
- Phrases: `"exact phrase"`
|
||||
- Boolean: `docker OR kubernetes`, `python NOT java`
|
||||
- Prefix: `deploy*`
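
You can try each query form against an in-memory FTS5 table. The single-column `messages_fts` schema and the sample rows below are assumptions made for illustration, not the real `state.db` layout, and the sketch requires a Python build whose SQLite includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical one-column table; the real messages_fts may have more columns.
conn.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO messages_fts (content) VALUES (?)",
    [
        ("docker deployment on the staging server",),
        ("kubernetes rollout plan",),
        ("python script to deploy the bot",),
        ("java and python comparison",),
    ],
)


def search(query):
    """Run an FTS5 MATCH query and return matching rows, best match first."""
    rows = conn.execute(
        "SELECT content FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

Keywords separated by spaces are combined with an implicit AND, so `search("docker deployment")` only returns rows containing both terms.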
|
||||
|
||||
### When It's Used
|
||||
|
||||
The agent is prompted to use session search automatically:
|
||||
|
||||
> *"When the user references something from a past conversation or you suspect relevant prior context exists, use session_search to recall it before asking them to repeat themselves."*
|
||||
|
||||
## Per-Platform Session Tracking
|
||||
|
||||
### Gateway Sessions
|
||||
|
||||
On messaging platforms, sessions are keyed by a deterministic session key built from the message source:
|
||||
|
||||
| Chat Type | Default Key Format | Behavior |
|-----------|--------------------|----------|
| Telegram DM | `agent:main:telegram:dm:<chat_id>` | One session per DM chat |
| Discord DM | `agent:main:discord:dm:<chat_id>` | One session per DM chat |
| WhatsApp DM | `agent:main:whatsapp:dm:<chat_id>` | One session per DM chat |
| Group chat | `agent:main:<platform>:group:<chat_id>:<user_id>` | Per-user inside the group when the platform exposes a user ID |
| Group thread/topic | `agent:main:<platform>:group:<chat_id>:<thread_id>:<user_id>` | Per-user inside that thread/topic |
| Channel | `agent:main:<platform>:channel:<chat_id>:<user_id>` | Per-user inside the channel when the platform exposes a user ID |
|
||||
|
||||
When Hermes cannot get a participant identifier for a shared chat, it falls back to one shared session for that room.
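
The key formats in the table follow one pattern, sketched below. The `build_session_key` function and its parameters are hypothetical, and the shape of the shared fallback key (the same key without the user segment) is an assumption.

```python
def build_session_key(platform, chat_type, chat_id, user_id=None, thread_id=None,
                      group_sessions_per_user=True):
    """Build a deterministic key such as agent:main:telegram:dm:<chat_id>."""
    parts = ["agent", "main", platform, chat_type, str(chat_id)]
    if chat_type != "dm":
        if thread_id is not None:
            parts.append(str(thread_id))
        # Without a user ID (or with per-user sessions off) the room shares one key.
        if group_sessions_per_user and user_id is not None:
            parts.append(str(user_id))
    return ":".join(parts)
```

Because the key is derived only from the message source, two messages from the same user in the same chat always map to the same session.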
|
||||
|
||||
### Shared vs Isolated Group Sessions
|
||||
|
||||
By default, Hermes uses `group_sessions_per_user: true` in `config.yaml`. That means:
|
||||
|
||||
- Alice and Bob can both talk to Hermes in the same Discord channel without sharing transcript history
|
||||
- one user's long tool-heavy task does not pollute another user's context window
|
||||
- interrupt handling also stays per-user because the running-agent key matches the isolated session key
|
||||
|
||||
If you want one shared "room brain" instead, set:
|
||||
|
||||
```yaml
|
||||
group_sessions_per_user: false
|
||||
```
|
||||
|
||||
That reverts groups/channels to a single shared session per room, which preserves shared conversational context but also shares token costs, interrupt state, and context growth.
|
||||
|
||||
### Session Reset Policies
|
||||
|
||||
Gateway sessions are automatically reset based on configurable policies:
|
||||
|
||||
- **idle** — reset after N minutes of inactivity
|
||||
- **daily** — reset at a specific hour each day
|
||||
- **both** — reset on whichever comes first (idle or daily)
|
||||
- **none** — never auto-reset
|
||||
|
||||
Before a session is auto-reset, the agent is given a turn to save any important memories or skills from the conversation.
|
||||
|
||||
Sessions with **active background processes** are never auto-reset, regardless of policy.
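
The four policies and the background-process exception can be expressed as a single decision function. This is a sketch only: the `should_reset` name, its parameters, and the default values for the idle window and daily hour are assumptions, not documented settings.

```python
from datetime import datetime, timedelta


def should_reset(policy, last_activity, now, idle_minutes=60, daily_hour=4,
                 has_background_processes=False):
    """Decide whether a gateway session is due for an automatic reset."""
    if has_background_processes or policy == "none":
        return False

    idle_due = now - last_activity >= timedelta(minutes=idle_minutes)

    # Most recent daily boundary at or before `now`.
    boundary = now.replace(hour=daily_hour, minute=0, second=0, microsecond=0)
    if boundary > now:
        boundary -= timedelta(days=1)
    daily_due = last_activity < boundary

    if policy == "idle":
        return idle_due
    if policy == "daily":
        return daily_due
    if policy == "both":
        return idle_due or daily_due  # whichever comes first
    raise ValueError(f"unknown reset policy: {policy}")
```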
|
||||
|
||||
## Storage Locations
|
||||
|
||||
| What | Path | Description |
|------|------|-------------|
| SQLite database | `~/.hermes/state.db` | All session metadata + messages with FTS5 |
| Gateway transcripts | `~/.hermes/sessions/` | JSONL transcripts per session + sessions.json index |
| Gateway index | `~/.hermes/sessions/sessions.json` | Maps session keys to active session IDs |
|
||||
|
||||
The SQLite database uses WAL mode for concurrent readers and a single writer, which suits the gateway's multi-platform architecture well.
|
||||
|
||||
### Database Schema
|
||||
|
||||
Key tables in `state.db`:
|
||||
|
||||
- **sessions** — session metadata (id, source, user_id, model, title, timestamps, token counts). Titles have a unique index (NULL titles allowed, only non-NULL must be unique).
|
||||
- **messages** — full message history (role, content, tool_calls, tool_name, token_count)
|
||||
- **messages_fts** — FTS5 virtual table for full-text search across message content
|
||||
|
||||
## Session Expiry and Cleanup
|
||||
|
||||
### Automatic Cleanup
|
||||
|
||||
- Gateway sessions auto-reset based on the configured reset policy
|
||||
- Before reset, the agent saves memories and skills from the expiring session
|
||||
- Ended sessions remain in the database until pruned
|
||||
|
||||
### Manual Cleanup
|
||||
|
||||
```bash
|
||||
# Prune sessions older than 90 days
|
||||
hermes sessions prune
|
||||
|
||||
# Delete a specific session
|
||||
hermes sessions delete <session_id>
|
||||
|
||||
# Export before pruning (backup)
|
||||
hermes sessions export backup.jsonl
|
||||
hermes sessions prune --older-than 30 --yes
|
||||
```
|
||||
|
||||
:::tip
|
||||
The database grows slowly (typical: 10-15 MB for hundreds of sessions). Pruning is mainly useful for removing old conversations you no longer need for search recall.
|
||||
:::
|
||||
139
hermes_code/website/docusaurus.config.ts
Normal file
@ -0,0 +1,139 @@
import {themes as prismThemes} from 'prism-react-renderer';
import type {Config} from '@docusaurus/types';
import type * as Preset from '@docusaurus/preset-classic';

const config: Config = {
  title: 'Hermes Agent',
  tagline: 'The self-improving AI agent',
  favicon: 'img/favicon.ico',

  url: 'https://hermes-agent.nousresearch.com',
  baseUrl: '/docs/',

  organizationName: 'NousResearch',
  projectName: 'hermes-agent',

  onBrokenLinks: 'warn',

  markdown: {
    mermaid: true,
    hooks: {
      onBrokenMarkdownLinks: 'warn',
    },
  },

  i18n: {
    defaultLocale: 'en',
    locales: ['en'],
  },

  themes: [
    '@docusaurus/theme-mermaid',
    [
      require.resolve('@easyops-cn/docusaurus-search-local'),
      /** @type {import("@easyops-cn/docusaurus-search-local").PluginOptions} */
      ({
        hashed: true,
        language: ['en'],
        indexBlog: false,
        docsRouteBasePath: '/',
        highlightSearchTermsOnTargetPage: true,
      }),
    ],
  ],

  presets: [
    [
      'classic',
      {
        docs: {
          routeBasePath: '/', // Docs at the root of /docs/
          sidebarPath: './sidebars.ts',
          editUrl: 'https://github.com/NousResearch/hermes-agent/edit/main/website/',
        },
        blog: false,
        theme: {
          customCss: './src/css/custom.css',
        },
      } satisfies Preset.Options,
    ],
  ],

  themeConfig: {
    image: 'img/hermes-agent-banner.png',
    colorMode: {
      defaultMode: 'dark',
      respectPrefersColorScheme: true,
    },
    navbar: {
      title: 'Hermes Agent',
      logo: {
        alt: 'Hermes Agent',
        src: 'img/logo.png',
      },
      items: [
        {
          type: 'docSidebar',
          sidebarId: 'docs',
          position: 'left',
          label: 'Docs',
        },
        {
          href: 'https://hermes-agent.nousresearch.com',
          label: 'Home',
          position: 'right',
        },
        {
          href: 'https://github.com/NousResearch/hermes-agent',
          label: 'GitHub',
          position: 'right',
        },
        {
          href: 'https://discord.gg/NousResearch',
          label: 'Discord',
          position: 'right',
        },
      ],
    },
    footer: {
      style: 'dark',
      links: [
        {
          title: 'Docs',
          items: [
            { label: 'Getting Started', to: '/getting-started/quickstart' },
            { label: 'User Guide', to: '/user-guide/cli' },
            { label: 'Developer Guide', to: '/developer-guide/architecture' },
            { label: 'Reference', to: '/reference/cli-commands' },
          ],
        },
        {
          title: 'Community',
          items: [
            { label: 'Discord', href: 'https://discord.gg/NousResearch' },
            { label: 'GitHub Discussions', href: 'https://github.com/NousResearch/hermes-agent/discussions' },
            { label: 'Skills Hub', href: 'https://agentskills.io' },
          ],
        },
        {
          title: 'More',
          items: [
            { label: 'GitHub', href: 'https://github.com/NousResearch/hermes-agent' },
            { label: 'Nous Research', href: 'https://nousresearch.com' },
          ],
        },
      ],
      copyright: `Built by <a href="https://nousresearch.com">Nous Research</a> · MIT License · ${new Date().getFullYear()}`,
    },
    prism: {
      theme: prismThemes.github,
      darkTheme: prismThemes.dracula,
      additionalLanguages: ['bash', 'yaml', 'json', 'python', 'toml'],
    },
    mermaid: {
      theme: {light: 'neutral', dark: 'dark'},
    },
  } satisfies Preset.ThemeConfig,
};

export default config;
20255
hermes_code/website/package-lock.json
generated
Normal file
File diff suppressed because it is too large
50
hermes_code/website/package.json
Normal file
@ -0,0 +1,50 @@
{
  "name": "website",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "docusaurus": "docusaurus",
    "start": "docusaurus start",
    "build": "docusaurus build",
    "swizzle": "docusaurus swizzle",
    "deploy": "docusaurus deploy",
    "clear": "docusaurus clear",
    "serve": "docusaurus serve",
    "write-translations": "docusaurus write-translations",
    "write-heading-ids": "docusaurus write-heading-ids",
    "typecheck": "tsc",
    "lint:diagrams": "ascii-guard lint docs"
  },
  "dependencies": {
    "@docusaurus/core": "3.9.2",
    "@docusaurus/preset-classic": "3.9.2",
    "@docusaurus/theme-mermaid": "^3.9.2",
    "@easyops-cn/docusaurus-search-local": "^0.55.1",
    "@mdx-js/react": "^3.0.0",
    "clsx": "^2.0.0",
    "prism-react-renderer": "^2.3.0",
    "react": "^19.0.0",
    "react-dom": "^19.0.0"
  },
  "devDependencies": {
    "@docusaurus/module-type-aliases": "3.9.2",
    "@docusaurus/tsconfig": "3.9.2",
    "@docusaurus/types": "3.9.2",
    "typescript": "~5.6.2"
  },
  "browserslist": {
    "production": [
      ">0.5%",
      "not dead",
      "not op_mini all"
    ],
    "development": [
      "last 3 chrome version",
      "last 3 firefox version",
      "last 5 safari version"
    ]
  },
  "engines": {
    "node": ">=20.0"
  }
}
154
hermes_code/website/sidebars.ts
Normal file
@ -0,0 +1,154 @@
import type {SidebarsConfig} from '@docusaurus/plugin-content-docs';

const sidebars: SidebarsConfig = {
  docs: [
    {
      type: 'category',
      label: 'Getting Started',
      collapsed: false,
      items: [
        'getting-started/quickstart',
        'getting-started/installation',
        'getting-started/updating',
        'getting-started/learning-path',
      ],
    },
    {
      type: 'category',
      label: 'Guides & Tutorials',
      collapsed: false,
      items: [
        'guides/tips',
        'guides/daily-briefing-bot',
        'guides/team-telegram-assistant',
        'guides/python-library',
        'guides/use-mcp-with-hermes',
        'guides/use-soul-with-hermes',
        'guides/use-voice-mode-with-hermes',
      ],
    },
    {
      type: 'category',
      label: 'User Guide',
      collapsed: false,
      items: [
        'user-guide/cli',
        'user-guide/configuration',
        'user-guide/sessions',
        'user-guide/security',
        {
          type: 'category',
          label: 'Messaging Gateway',
          items: [
            'user-guide/messaging/index',
            'user-guide/messaging/telegram',
            'user-guide/messaging/discord',
            'user-guide/messaging/slack',
            'user-guide/messaging/whatsapp',
            'user-guide/messaging/signal',
            'user-guide/messaging/email',
            'user-guide/messaging/homeassistant',
            'user-guide/messaging/mattermost',
            'user-guide/messaging/matrix',
            'user-guide/messaging/dingtalk',
            'user-guide/messaging/open-webui',
            'user-guide/messaging/webhooks',
          ],
        },
        {
          type: 'category',
          label: 'Core Features',
          items: [
            'user-guide/features/tools',
            'user-guide/features/skills',
            'user-guide/features/memory',
            'user-guide/features/context-files',
            'user-guide/features/personality',
            'user-guide/features/skins',
          ],
        },
        {
          type: 'category',
          label: 'Automation',
          items: [
            'user-guide/features/cron',
            'user-guide/features/delegation',
            'user-guide/features/code-execution',
            'user-guide/features/hooks',
          ],
        },
        {
          type: 'category',
          label: 'Web & Media',
          items: [
            'user-guide/features/voice-mode',
            'user-guide/features/browser',
            'user-guide/features/vision',
            'user-guide/features/image-generation',
            'user-guide/features/tts',
          ],
        },
        {
          type: 'category',
          label: 'Integrations',
          items: [
            'user-guide/features/api-server',
            'user-guide/features/acp',
            'user-guide/features/mcp',
            'user-guide/features/honcho',
            'user-guide/features/provider-routing',
            'user-guide/features/fallback-providers',
          ],
        },
        {
          type: 'category',
          label: 'Advanced',
          items: [
            'user-guide/features/batch-processing',
            'user-guide/features/rl-training',
          ],
        },
      ],
    },
    {
      type: 'category',
      label: 'Developer Guide',
      items: [
        'developer-guide/architecture',
        'developer-guide/agent-loop',
        'developer-guide/provider-runtime',
        'developer-guide/adding-providers',
        'developer-guide/prompt-assembly',
        'developer-guide/context-compression-and-caching',
        'developer-guide/gateway-internals',
        'developer-guide/session-storage',
        'developer-guide/tools-runtime',
        'developer-guide/acp-internals',
        'developer-guide/trajectory-format',
        'developer-guide/cron-internals',
        'developer-guide/environments',
        'developer-guide/adding-tools',
        'developer-guide/creating-skills',
        'developer-guide/extending-the-cli',
        'developer-guide/contributing',
      ],
    },
    {
      type: 'category',
      label: 'Reference',
      items: [
        'reference/cli-commands',
        'reference/slash-commands',
        'reference/tools-reference',
        'reference/toolsets-reference',
        'reference/mcp-config-reference',
        'reference/skills-catalog',
        'reference/optional-skills-catalog',
        'reference/environment-variables',
        'reference/faq',
      ],
    },
  ],
};

export default sidebars;
206
hermes_code/website/src/css/custom.css
Normal file
@ -0,0 +1,206 @@
/**
 * Hermes Agent — Custom Docusaurus Theme
 * Matches the landing page branding: amber-on-dark, terminal aesthetic
 * Colors from landingpage/style.css
 */

/* Import fonts to match landing page */
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap');

:root {
  /* Gold/Amber palette from landing page */
  --ifm-color-primary: #FFD700;
  --ifm-color-primary-dark: #E6C200;
  --ifm-color-primary-darker: #D9B700;
  --ifm-color-primary-darkest: #B39600;
  --ifm-color-primary-light: #FFDD33;
  --ifm-color-primary-lighter: #FFE14D;
  --ifm-color-primary-lightest: #FFEB80;

  --ifm-font-family-base: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
  --ifm-font-family-monospace: 'JetBrains Mono', 'Fira Code', 'Cascadia Code', monospace;

  --ifm-code-font-size: 90%;
  --ifm-heading-font-weight: 600;
}

/* Dark mode — the PRIMARY mode, matches landing page */
[data-theme='dark'] {
  --ifm-color-primary: #FFD700;
  --ifm-color-primary-dark: #E6C200;
  --ifm-color-primary-darker: #D9B700;
  --ifm-color-primary-darkest: #B39600;
  --ifm-color-primary-light: #FFDD33;
  --ifm-color-primary-lighter: #FFE14D;
  --ifm-color-primary-lightest: #FFEB80;

  --ifm-background-color: #07070d;
  --ifm-background-surface-color: #0f0f18;
  --ifm-navbar-background-color: #07070dEE;
  --ifm-footer-background-color: #050509;
  --ifm-color-emphasis-100: #14142a;
  --ifm-color-emphasis-200: #1a1a30;

  --ifm-font-color-base: #e8e4dc;
  --ifm-font-color-secondary: #9a968e;

  --ifm-link-color: #FFD700;
  --ifm-link-hover-color: #FFBF00;

  --ifm-code-background: #0f0f18;

  --ifm-toc-border-color: rgba(255, 215, 0, 0.08);
  --ifm-hr-border-color: rgba(255, 215, 0, 0.08);

  --docusaurus-highlighted-code-line-bg: rgba(255, 215, 0, 0.08);
}

/* Subtle dot grid background matching landing page */
[data-theme='dark'] .main-wrapper {
  background-image: radial-gradient(rgba(255, 215, 0, 0.02) 1px, transparent 1px);
  background-size: 32px 32px;
}

/* Navbar styling */
.navbar {
  backdrop-filter: blur(12px);
  border-bottom: 1px solid rgba(255, 215, 0, 0.08);
}

.navbar__title {
  font-weight: 600;
  letter-spacing: -0.02em;
}

/* Sidebar tweaks */
[data-theme='dark'] .menu {
  background-color: transparent;
}

[data-theme='dark'] .menu__link--active:not(.menu__link--sublist) {
  background-color: rgba(255, 215, 0, 0.08);
  border-left: 3px solid #FFD700;
  padding-left: calc(var(--ifm-menu-link-padding-horizontal) - 3px);
}

/* Code blocks */
[data-theme='dark'] .prism-code {
  background-color: #0a0a12 !important;
  border: 1px solid rgba(255, 215, 0, 0.06);
}

/* Text diagrams: preserve spacing, disable ligatures, and prefer box-drawing-safe fonts */
pre.prism-code.language-text,
pre.prism-code.language-plaintext,
pre.prism-code.language-txt,
pre.prism-code.language-ascii {
  white-space: pre;
  overflow-x: auto;
  line-height: 1.35;
  font-family: 'JetBrains Mono', 'Cascadia Mono', 'Cascadia Code', 'Fira Code', 'SFMono-Regular', 'DejaVu Sans Mono', 'Liberation Mono', monospace;
  font-variant-ligatures: none;
  font-feature-settings: "liga" 0, "calt" 0;
  text-rendering: optimizeSpeed;
}

pre.prism-code.language-text code,
pre.prism-code.language-plaintext code,
pre.prism-code.language-txt code,
pre.prism-code.language-ascii code {
  white-space: pre;
  font-variant-ligatures: none;
  font-feature-settings: "liga" 0, "calt" 0;
}

.theme-mermaid {
  margin: 1.5rem 0;
  text-align: center;
}

.theme-mermaid svg {
  max-width: 100%;
  height: auto;
}

.docs-terminal-figure {
  display: block;
  width: 100%;
  max-width: 900px;
  margin: 1.25rem auto 0.5rem;
  border: 1px solid rgba(255, 215, 0, 0.08);
  border-radius: 12px;
  background: #0a0a12;
}

.docs-figure-caption {
  margin-top: 0.35rem;
  text-align: center;
  color: var(--ifm-font-color-secondary);
  font-size: 0.95rem;
}

/* Admonitions — gold-tinted */
[data-theme='dark'] .alert--info {
  --ifm-alert-background-color: rgba(255, 215, 0, 0.05);
  --ifm-alert-border-color: rgba(255, 215, 0, 0.15);
}

/* Table styling */
[data-theme='dark'] table {
  border-collapse: collapse;
}

[data-theme='dark'] table th {
  background-color: rgba(255, 215, 0, 0.06);
  border-color: rgba(255, 215, 0, 0.12);
}

[data-theme='dark'] table td {
  border-color: rgba(255, 215, 0, 0.06);
}

/* Footer */
.footer {
  border-top: 1px solid rgba(255, 215, 0, 0.08);
}

.footer a {
  color: #9a968e;
  transition: color 0.2s;
}

.footer a:hover {
  color: #FFD700;
  text-decoration: none;
}

/* Scrollbar */
[data-theme='dark'] ::-webkit-scrollbar {
  width: 8px;
  height: 8px;
}

[data-theme='dark'] ::-webkit-scrollbar-track {
  background: #07070d;
}

[data-theme='dark'] ::-webkit-scrollbar-thumb {
  background: #1a1a30;
  border-radius: 4px;
}

[data-theme='dark'] ::-webkit-scrollbar-thumb:hover {
  background: #2a2a40;
}

/* Search bar */
[data-theme='dark'] .DocSearch-Button {
  background-color: #0f0f18;
  border: 1px solid rgba(255, 215, 0, 0.08);
}

/* Hero banner for docs landing if needed */
.hero--hermes {
  background: linear-gradient(135deg, #07070d 0%, #0f0f18 100%);
  padding: 4rem 0;
}
0
hermes_code/website/static/.nojekyll
Normal file
Some files were not shown because too many files have changed in this diff