The architecture has been updated

This commit is contained in:
Skyber_2 2026-03-31 23:31:36 +03:00
parent 805f7a017e
commit a01257ead9
1119 changed files with 226 additions and 352 deletions


@@ -0,0 +1,6 @@
{
"label": "Guides & Tutorials",
"position": 2,
"collapsible": true,
"collapsed": false
}


@@ -0,0 +1,441 @@
---
sidebar_position: 10
---
# Build a Hermes Plugin
This guide walks through building a complete Hermes plugin from scratch. By the end you'll have a working plugin with multiple tools, lifecycle hooks, shipped data files, and a bundled skill — everything the plugin system supports.
## What you're building
A **calculator** plugin with two tools:
- `calculate` — evaluate math expressions (`2**16`, `sqrt(144)`, `pi * 5**2`)
- `unit_convert` — convert between units (`100 F → 37.78 C`, `5 km → 3.11 mi`)
Plus a hook that logs every tool call, and a bundled skill file.
## Step 1: Create the plugin directory
```bash
mkdir -p ~/.hermes/plugins/calculator
cd ~/.hermes/plugins/calculator
```
## Step 2: Write the manifest
Create `plugin.yaml`:
```yaml
name: calculator
version: 1.0.0
description: Math calculator — evaluate expressions and convert units
provides_tools:
- calculate
- unit_convert
provides_hooks:
- post_tool_call
```
This tells Hermes: "I'm a plugin called calculator, I provide tools and hooks." The `provides_tools` and `provides_hooks` fields are lists of what the plugin registers.
Optional fields you could add:
```yaml
author: Your Name
requires_env: # gate loading on env vars
- SOME_API_KEY # plugin disabled if missing
```
## Step 3: Write the tool schemas
Create `schemas.py` — this is what the LLM reads to decide when to call your tools:
```python
"""Tool schemas — what the LLM sees."""
CALCULATE = {
"name": "calculate",
"description": (
"Evaluate a mathematical expression and return the result. "
"Supports arithmetic (+, -, *, /, **), functions (sqrt, sin, cos, "
"log, abs, round, floor, ceil), and constants (pi, e). "
"Use this for any math the user asks about."
),
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "Math expression to evaluate (e.g., '2**10', 'sqrt(144)')",
},
},
"required": ["expression"],
},
}
UNIT_CONVERT = {
"name": "unit_convert",
"description": (
"Convert a value between units. Supports length (m, km, mi, ft, in), "
"weight (kg, lb, oz, g), temperature (C, F, K), data (B, KB, MB, GB, TB), "
"and time (s, min, hr, day)."
),
"parameters": {
"type": "object",
"properties": {
"value": {
"type": "number",
"description": "The numeric value to convert",
},
"from_unit": {
"type": "string",
"description": "Source unit (e.g., 'km', 'lb', 'F', 'GB')",
},
"to_unit": {
"type": "string",
"description": "Target unit (e.g., 'mi', 'kg', 'C', 'MB')",
},
},
"required": ["value", "from_unit", "to_unit"],
},
}
```
**Why schemas matter:** The `description` field is how the LLM decides when to use your tool. Be specific about what it does and when to use it. The `parameters` define what arguments the LLM passes.
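Before wiring a schema up, a quick structural sanity check catches typos between `required` and `properties`. This is plain Python with nothing Hermes-specific, offered as an optional sketch:

```python
def check_schema(schema: dict) -> None:
    """Verify every 'required' parameter is defined under 'properties'."""
    params = schema["parameters"]
    defined = set(params.get("properties", {}))
    missing = [r for r in params.get("required", []) if r not in defined]
    if missing:
        raise ValueError(f"{schema['name']}: required but undefined: {missing}")

# Passes silently for a well-formed schema
check_schema({
    "name": "calculate",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
})
```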
## Step 4: Write the tool handlers
Create `tools.py` — this is the code that actually executes when the LLM calls your tools:
```python
"""Tool handlers — the code that runs when the LLM calls each tool."""
import json
import math
# Safe globals for expression evaluation — no file/network access
_SAFE_MATH = {
"abs": abs, "round": round, "min": min, "max": max,
"pow": pow, "sqrt": math.sqrt, "sin": math.sin, "cos": math.cos,
"tan": math.tan, "log": math.log, "log2": math.log2, "log10": math.log10,
"floor": math.floor, "ceil": math.ceil,
"pi": math.pi, "e": math.e,
"factorial": math.factorial,
}
def calculate(args: dict, **kwargs) -> str:
"""Evaluate a math expression safely.
Rules for handlers:
1. Receive args (dict) — the parameters the LLM passed
2. Do the work
3. Return a JSON string — ALWAYS, even on error
4. Accept **kwargs for forward compatibility
"""
expression = args.get("expression", "").strip()
if not expression:
return json.dumps({"error": "No expression provided"})
try:
result = eval(expression, {"__builtins__": {}}, _SAFE_MATH)
return json.dumps({"expression": expression, "result": result})
except ZeroDivisionError:
return json.dumps({"expression": expression, "error": "Division by zero"})
except Exception as e:
return json.dumps({"expression": expression, "error": f"Invalid: {e}"})
# Conversion tables — values are in base units
_LENGTH = {"m": 1, "km": 1000, "mi": 1609.34, "ft": 0.3048, "in": 0.0254, "cm": 0.01}
_WEIGHT = {"kg": 1, "g": 0.001, "lb": 0.453592, "oz": 0.0283495}
_DATA = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
_TIME = {"s": 1, "ms": 0.001, "min": 60, "hr": 3600, "day": 86400}
def _convert_temp(value, from_u, to_u):
# Normalize to Celsius
c = {"F": (value - 32) * 5/9, "K": value - 273.15}.get(from_u, value)
# Convert to target
return {"F": c * 9/5 + 32, "K": c + 273.15}.get(to_u, c)
def unit_convert(args: dict, **kwargs) -> str:
"""Convert between units."""
value = args.get("value")
from_unit = args.get("from_unit", "").strip()
to_unit = args.get("to_unit", "").strip()
if value is None or not from_unit or not to_unit:
return json.dumps({"error": "Need value, from_unit, and to_unit"})
try:
# Temperature
if from_unit.upper() in {"C","F","K"} and to_unit.upper() in {"C","F","K"}:
result = _convert_temp(float(value), from_unit.upper(), to_unit.upper())
return json.dumps({"input": f"{value} {from_unit}", "result": round(result, 4),
"output": f"{round(result, 4)} {to_unit}"})
# Ratio-based conversions
for table in (_LENGTH, _WEIGHT, _DATA, _TIME):
lc = {k.lower(): v for k, v in table.items()}
if from_unit.lower() in lc and to_unit.lower() in lc:
result = float(value) * lc[from_unit.lower()] / lc[to_unit.lower()]
return json.dumps({"input": f"{value} {from_unit}",
"result": round(result, 6),
"output": f"{round(result, 6)} {to_unit}"})
return json.dumps({"error": f"Cannot convert {from_unit} → {to_unit}"})
except Exception as e:
return json.dumps({"error": f"Conversion failed: {e}"})
```
**Key rules for handlers:**
1. **Signature:** `def my_handler(args: dict, **kwargs) -> str`
2. **Return:** Always a JSON string. Success and errors alike.
3. **Never raise:** Catch all exceptions, return error JSON instead.
4. **Accept `**kwargs`:** Hermes may pass additional context in the future.
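Taken together, these rules reduce to a skeleton you can copy for any new handler (the handler name, the `value` parameter, and the doubling logic are placeholders, not part of the Hermes API):

```python
import json

def my_handler(args: dict, **kwargs) -> str:
    """Skeleton handler: validate, do the work, always return a JSON string."""
    try:
        value = args.get("value")
        if value is None:
            return json.dumps({"error": "Missing 'value'"})
        return json.dumps({"result": value * 2})  # placeholder work
    except Exception as e:
        return json.dumps({"error": str(e)})
```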
## Step 5: Write the registration
Create `__init__.py` — this wires schemas to handlers:
```python
"""Calculator plugin — registration."""
import logging
from . import schemas, tools
logger = logging.getLogger(__name__)
# Track tool usage via hooks
_call_log = []
def _on_post_tool_call(tool_name, args, result, task_id, **kwargs):
"""Hook: runs after every tool call (not just ours)."""
_call_log.append({"tool": tool_name, "session": task_id})
if len(_call_log) > 100:
_call_log.pop(0)
logger.debug("Tool called: %s (session %s)", tool_name, task_id)
def register(ctx):
"""Wire schemas to handlers and register hooks."""
ctx.register_tool(name="calculate", toolset="calculator",
schema=schemas.CALCULATE, handler=tools.calculate)
ctx.register_tool(name="unit_convert", toolset="calculator",
schema=schemas.UNIT_CONVERT, handler=tools.unit_convert)
# This hook fires for ALL tool calls, not just ours
ctx.register_hook("post_tool_call", _on_post_tool_call)
```
**What `register()` does:**
- Called exactly once at startup
- `ctx.register_tool()` puts your tool in the registry — the model sees it immediately
- `ctx.register_hook()` subscribes to lifecycle events
- `ctx.register_command()` — _planned but not yet implemented_
- If this function crashes, the plugin is disabled but Hermes continues fine
## Step 6: Test it
Start Hermes:
```bash
hermes
```
You should see `calculator: calculate, unit_convert` in the banner's tool list.
Try these prompts:
```
What's 2 to the power of 16?
Convert 100 fahrenheit to celsius
What's the square root of 2 times pi?
How many gigabytes is 1.5 terabytes?
```
Check plugin status:
```
/plugins
```
Output:
```
Plugins (1):
✓ calculator v1.0.0 (2 tools, 1 hooks)
```
## Your plugin's final structure
```
~/.hermes/plugins/calculator/
├── plugin.yaml # "I'm calculator, I provide tools and hooks"
├── __init__.py # Wiring: schemas → handlers, register hooks
├── schemas.py # What the LLM reads (descriptions + parameter specs)
└── tools.py # What runs (calculate, unit_convert functions)
```
Four files, clear separation:
- **Manifest** declares what the plugin is
- **Schemas** describe tools for the LLM
- **Handlers** implement the actual logic
- **Registration** connects everything
## What else can plugins do?
### Ship data files
Put any files in your plugin directory and read them at import time:
```python
# In tools.py or __init__.py
from pathlib import Path
import yaml  # PyYAML must be available in the environment
_PLUGIN_DIR = Path(__file__).parent
_DATA_FILE = _PLUGIN_DIR / "data" / "languages.yaml"
with open(_DATA_FILE) as f:
_DATA = yaml.safe_load(f)
```
### Bundle a skill
Include a `skill.md` file and install it during registration:
```python
import shutil
from pathlib import Path
def _install_skill():
"""Copy our skill to ~/.hermes/skills/ on first load."""
try:
from hermes_cli.config import get_hermes_home
dest = get_hermes_home() / "skills" / "my-plugin" / "SKILL.md"
except Exception:
dest = Path.home() / ".hermes" / "skills" / "my-plugin" / "SKILL.md"
if dest.exists():
return # don't overwrite user edits
source = Path(__file__).parent / "skill.md"
if source.exists():
dest.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(source, dest)
def register(ctx):
ctx.register_tool(...)
_install_skill()
```
### Gate on environment variables
If your plugin needs an API key:
```yaml
# plugin.yaml
requires_env:
- WEATHER_API_KEY
```
If `WEATHER_API_KEY` isn't set, the plugin is disabled with a clear message. No crash, no error in the agent — just "Plugin weather disabled (missing: WEATHER_API_KEY)".
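Conceptually, the gate is just an environment check at load time. The sketch below illustrates the idea; it is not Hermes's actual implementation:

```python
import os

def missing_env(requires_env: list[str]) -> list[str]:
    """Return the manifest's env vars that are unset or empty."""
    return [var for var in requires_env if not os.environ.get(var)]

missing = missing_env(["WEATHER_API_KEY"])
if missing:
    print(f"Plugin weather disabled (missing: {', '.join(missing)})")
```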
### Conditional tool availability
For tools that depend on optional libraries:
```python
ctx.register_tool(
name="my_tool",
schema={...},
handler=my_handler,
check_fn=lambda: _has_optional_lib(), # False = tool hidden from model
)
```
### Register multiple hooks
```python
def register(ctx):
ctx.register_hook("pre_tool_call", before_any_tool)
ctx.register_hook("post_tool_call", after_any_tool)
ctx.register_hook("on_session_start", on_new_session)
ctx.register_hook("on_session_end", on_session_end)
```
Available hooks:
| Hook | When | Arguments |
|------|------|-----------|
| `pre_tool_call` | Before any tool runs | `tool_name`, `args`, `task_id` |
| `post_tool_call` | After any tool returns | `tool_name`, `args`, `result`, `task_id` |
| `pre_llm_call` | Before LLM API call | `messages`, `model` |
| `post_llm_call` | After LLM response | `messages`, `response`, `model` |
| `on_session_start` | Session begins | `session_id`, `platform` |
| `on_session_end` | Session ends | `session_id`, `platform` |
Hooks are observers — they can't modify arguments or return values. If a hook crashes, it's logged and skipped; other hooks and the tool continue normally.
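The observer contract can be pictured as a dispatcher that swallows hook errors. This is a sketch of the behavior described above, not Hermes's actual code:

```python
import logging

logger = logging.getLogger(__name__)

def fire_hooks(hooks: dict, event: str, **payload) -> None:
    """Call every subscriber for an event; a crashing hook is logged and skipped."""
    for hook in hooks.get(event, []):
        try:
            hook(**payload)  # return value ignored: hooks are observers
        except Exception:
            logger.exception("Hook for %s failed; continuing", event)
```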
### Distribute via pip
For sharing plugins publicly, add an entry point to your Python package:
```toml
# pyproject.toml
[project.entry-points."hermes_agent.plugins"]
my-plugin = "my_plugin_package"
```
```bash
pip install hermes-plugin-calculator
# Plugin auto-discovered on next hermes startup
```
## Common mistakes
**Handler doesn't return JSON string:**
```python
# Wrong — returns a dict
def handler(args, **kwargs):
return {"result": 42}
# Right — returns a JSON string
def handler(args, **kwargs):
return json.dumps({"result": 42})
```
**Missing `**kwargs` in handler signature:**
```python
# Wrong — will break if Hermes passes extra context
def handler(args):
...
# Right
def handler(args, **kwargs):
...
```
**Handler raises exceptions:**
```python
# Wrong — exception propagates, tool call fails
def handler(args, **kwargs):
result = 1 / int(args["value"]) # ZeroDivisionError!
return json.dumps({"result": result})
# Right — catch and return error JSON
def handler(args, **kwargs):
try:
result = 1 / int(args.get("value", 0))
return json.dumps({"result": result})
except Exception as e:
return json.dumps({"error": str(e)})
```
**Schema description too vague:**
```python
# Bad — model doesn't know when to use it
"description": "Does stuff"
# Good — model knows exactly when and how
"description": "Evaluate a mathematical expression. Use for arithmetic, trig, logarithms. Supports: +, -, *, /, **, sqrt, sin, cos, log, pi, e."
```


@@ -0,0 +1,266 @@
---
sidebar_position: 2
title: "Tutorial: Daily Briefing Bot"
description: "Build an automated daily briefing bot that researches topics, summarizes findings, and delivers them to Telegram or Discord every morning"
---
# Tutorial: Build a Daily Briefing Bot
In this tutorial, you'll build a personal briefing bot that wakes up every morning, researches topics you care about, summarizes the findings, and delivers a concise briefing straight to your Telegram or Discord.
By the end, you'll have a fully automated workflow combining **web search**, **cron scheduling**, **delegation**, and **messaging delivery** — no code required.
## What We're Building
Here's the flow:
1. **8:00 AM** — The cron scheduler triggers your job
2. **Hermes spins up** a fresh agent session with your prompt
3. **Web search** pulls the latest news on your topics
4. **Summarization** distills it into a clean briefing format
5. **Delivery** sends the briefing to your Telegram or Discord
The whole thing runs hands-free. You just read your briefing with your morning coffee.
## Prerequisites
Before starting, make sure you have:
- **Hermes Agent installed** — see the [Installation guide](/docs/getting-started/installation)
- **Gateway running** — the gateway daemon handles cron execution:
```bash
hermes gateway install # Install as a user service
sudo hermes gateway install --system # Linux servers: boot-time system service
# or
hermes gateway # Run in foreground
```
- **Firecrawl API key** — set `FIRECRAWL_API_KEY` in your environment for web search
- **Messaging configured** (optional but recommended) — [Telegram](/docs/user-guide/messaging/telegram) or Discord set up with a home channel
:::tip No messaging? No problem
You can still follow this tutorial using `deliver: "local"`. Briefings will be saved to `~/.hermes/cron/output/` and you can read them anytime.
:::
## Step 1: Test the Workflow Manually
Before automating anything, let's make sure the briefing works. Start a chat session:
```bash
hermes
```
Then enter this prompt:
```
Search for the latest news about AI agents and open source LLMs.
Summarize the top 3 stories in a concise briefing format with links.
```
Hermes will search the web, read through results, and produce something like:
```
☀️ Your AI Briefing — March 8, 2026
1. Qwen 3 Released with 235B Parameters
Alibaba's latest open-weight model matches GPT-4.5 on several
benchmarks while remaining fully open source.
→ https://qwenlm.github.io/blog/qwen3/
2. LangChain Launches Agent Protocol Standard
A new open standard for agent-to-agent communication gains
adoption from 15 major frameworks in its first week.
→ https://blog.langchain.dev/agent-protocol/
3. EU AI Act Enforcement Begins for General-Purpose Models
The first compliance deadlines hit, with open source models
receiving exemptions under the 10M parameter threshold.
→ https://artificialintelligenceact.eu/updates/
---
3 stories • Sources searched: 8 • Generated by Hermes Agent
```
If this works, you're ready to automate it.
:::tip Iterate on the format
Try different prompts until you get output you love. Add instructions like "use emoji headers" or "keep each summary under 2 sentences." Whatever you settle on goes into the cron job.
:::
## Step 2: Create the Cron Job
Now let's schedule this to run automatically every morning. You can do this in two ways.
### Option A: Natural Language (in chat)
Just tell Hermes what you want:
```
Every morning at 8am, search the web for the latest news about AI agents
and open source LLMs. Summarize the top 3 stories in a concise briefing
with links. Use a friendly, professional tone. Deliver to telegram.
```
Hermes will create the cron job for you using the unified `cronjob` tool.
### Option B: CLI Slash Command
Use the `/cron` command for more control:
```
/cron add "0 8 * * *" "Search the web for the latest news about AI agents and open source LLMs. Find at least 5 recent articles from the past 24 hours. Summarize the top 3 most important stories in a concise daily briefing format. For each story include: a clear headline, a 2-sentence summary, and the source URL. Use a friendly, professional tone. Format with emoji bullet points and end with a total story count."
```
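If you haven't memorized cron syntax, the five fields of `0 8 * * *` read as:

```
0 8 * * *
│ │ │ │ └─ day of week (0-6, Sunday-Saturday; * = any)
│ │ │ └─── month (1-12)
│ │ └───── day of month (1-31)
│ └─────── hour (8 = 8:00 AM)
└───────── minute (0)
```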
### The Golden Rule: Self-Contained Prompts
:::warning Critical concept
Cron jobs run in a **completely fresh session** — no memory of your previous conversations, no context about what you "set up earlier." Your prompt must contain **everything** the agent needs to do the job.
:::
**Bad prompt:**
```
Do my usual morning briefing.
```
**Good prompt:**
```
Search the web for the latest news about AI agents and open source LLMs.
Find at least 5 recent articles from the past 24 hours. Summarize the
top 3 most important stories in a concise daily briefing format. For each
story include: a clear headline, a 2-sentence summary, and the source URL.
Use a friendly, professional tone. Format with emoji bullet points.
```
The good prompt is specific about **what to search**, **how many articles**, **what format**, and **what tone**. It's everything the agent needs in one shot.
## Step 3: Customize the Briefing
Once the basic briefing works, you can get creative.
### Multi-Topic Briefings
Cover several areas in one briefing:
```
/cron add "0 8 * * *" "Create a morning briefing covering three topics. For each topic, search the web for recent news from the past 24 hours and summarize the top 2 stories with links.
Topics:
1. AI and machine learning — focus on open source models and agent frameworks
2. Cryptocurrency — focus on Bitcoin, Ethereum, and regulatory news
3. Space exploration — focus on SpaceX, NASA, and commercial space
Format as a clean briefing with section headers and emoji. End with today's date and a motivational quote."
```
### Using Delegation for Parallel Research
For faster briefings, tell Hermes to delegate each topic to a sub-agent:
```
/cron add "0 8 * * *" "Create a morning briefing by delegating research to sub-agents. Delegate three parallel tasks:
1. Delegate: Search for the top 2 AI/ML news stories from the past 24 hours with links
2. Delegate: Search for the top 2 cryptocurrency news stories from the past 24 hours with links
3. Delegate: Search for the top 2 space exploration news stories from the past 24 hours with links
Collect all results and combine them into a single clean briefing with section headers, emoji formatting, and source links. Add today's date as a header."
```
Each sub-agent searches independently and in parallel, then the main agent combines everything into one polished briefing. See the [Delegation docs](/docs/user-guide/features/delegation) for more on how this works.
### Weekday-Only Schedule
Don't need briefings on weekends? Use a cron expression that targets Monday through Friday:
```
/cron add "0 8 * * 1-5" "Search for the latest AI and tech news..."
```
### Twice-Daily Briefings
Get a morning overview and an evening recap:
```
/cron add "0 8 * * *" "Morning briefing: search for AI news from the past 12 hours..."
/cron add "0 18 * * *" "Evening recap: search for AI news from the past 12 hours..."
```
### Adding Personal Context with Memory
If you have [memory](/docs/user-guide/features/memory) enabled, you can store preferences that persist across sessions. But remember — cron jobs run in fresh sessions without conversational memory. To add personal context, bake it directly into the prompt:
```
/cron add "0 8 * * *" "You are creating a briefing for a senior ML engineer who cares about: PyTorch ecosystem, transformer architectures, open-weight models, and AI regulation in the EU. Skip stories about product launches or funding rounds unless they involve open source.
Search for the latest news on these topics. Summarize the top 3 stories with links. Be concise and technical — this reader doesn't need basic explanations."
```
:::tip Tailor the persona
Including details about who the briefing is *for* dramatically improves relevance. Tell the agent your role, interests, and what to skip.
:::
## Step 4: Manage Your Jobs
### List All Scheduled Jobs
In chat:
```
/cron list
```
Or from the terminal:
```bash
hermes cron list
```
You'll see output like:
```
ID | Name | Schedule | Next Run | Deliver
------------|-------------------|-------------|--------------------|--------
a1b2c3d4 | Morning Briefing | 0 8 * * * | 2026-03-09 08:00 | telegram
e5f6g7h8 | Evening Recap | 0 18 * * * | 2026-03-08 18:00 | telegram
```
### Remove a Job
In chat:
```
/cron remove a1b2c3d4
```
Or ask conversationally:
```
Remove my morning briefing cron job.
```
Hermes will use `cronjob(action="list")` to find it and `cronjob(action="remove")` to delete it.
### Check Gateway Status
Make sure the scheduler is actually running:
```bash
hermes cron status
```
If the gateway isn't running, your jobs won't execute. Install it as a background service for reliability:
```bash
hermes gateway install
# or on Linux servers
sudo hermes gateway install --system
```
## Going Further
You've built a working daily briefing bot. Here are some directions to explore next:
- **[Scheduled Tasks (Cron)](/docs/user-guide/features/cron)** — Full reference for schedule formats, repeat limits, and delivery options
- **[Delegation](/docs/user-guide/features/delegation)** — Deep dive into parallel sub-agent workflows
- **[Messaging Platforms](/docs/user-guide/messaging)** — Set up Telegram, Discord, or other delivery targets
- **[Memory](/docs/user-guide/features/memory)** — Persistent context across sessions
- **[Tips & Best Practices](/docs/guides/tips)** — More prompt engineering advice
:::tip What else can you schedule?
The briefing bot pattern works for anything: competitor monitoring, GitHub repo summaries, weather forecasts, portfolio tracking, server health checks, or even a daily joke. If you can describe it in a prompt, you can schedule it.
:::


@@ -0,0 +1,340 @@
---
sidebar_position: 4
title: "Using Hermes as a Python Library"
description: "Embed AIAgent in your own Python scripts, web apps, or automation pipelines — no CLI required"
---
# Using Hermes as a Python Library
Hermes isn't just a CLI tool. You can import `AIAgent` directly and use it programmatically in your own Python scripts, web applications, or automation pipelines. This guide shows you how.
---
## Installation
Install Hermes directly from the repository:
```bash
pip install git+https://github.com/NousResearch/hermes-agent.git
```
Or with [uv](https://docs.astral.sh/uv/):
```bash
uv pip install git+https://github.com/NousResearch/hermes-agent.git
```
You can also pin it in your `requirements.txt`:
```text
hermes-agent @ git+https://github.com/NousResearch/hermes-agent.git
```
:::tip
The same environment variables used by the CLI are required when using Hermes as a library. At minimum, set `OPENROUTER_API_KEY` (or `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` if using direct provider access).
:::
---
## Basic Usage
The simplest way to use Hermes is the `chat()` method — pass a message, get a string back:
```python
from run_agent import AIAgent
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
print(response)
```
`chat()` handles the full conversation loop internally — tool calls, retries, everything — and returns just the final text response.
:::warning
Always set `quiet_mode=True` when embedding Hermes in your own code. Without it, the agent prints CLI spinners, progress indicators, and other terminal output that will clutter your application's output.
:::
---
## Full Conversation Control
For more control over the conversation, use `run_conversation()` directly. It returns a dictionary with the full response, message history, and metadata:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
result = agent.run_conversation(
user_message="Search for recent Python 3.13 features",
task_id="my-task-1",
)
print(result["final_response"])
print(f"Messages exchanged: {len(result['messages'])}")
```
The returned dictionary contains:
- **`final_response`** — The agent's final text reply
- **`messages`** — The complete message history (system, user, assistant, tool calls)
- **`task_id`** — The task identifier used for VM isolation
You can also pass a custom system message that overrides the ephemeral system prompt for that call:
```python
result = agent.run_conversation(
user_message="Explain quicksort",
system_message="You are a computer science tutor. Use simple analogies.",
)
```
---
## Configuring Tools
Control which toolsets the agent has access to using `enabled_toolsets` or `disabled_toolsets`:
```python
# Only enable web tools (browsing, search)
agent = AIAgent(
model="anthropic/claude-sonnet-4",
enabled_toolsets=["web"],
quiet_mode=True,
)
# Enable everything except terminal access
agent = AIAgent(
model="anthropic/claude-sonnet-4",
disabled_toolsets=["terminal"],
quiet_mode=True,
)
```
:::tip
Use `enabled_toolsets` when you want a minimal, locked-down agent (e.g., only web search for a research bot). Use `disabled_toolsets` when you want most capabilities but need to restrict specific ones (e.g., no terminal access in a shared environment).
:::
---
## Multi-turn Conversations
Maintain conversation state across multiple turns by passing the message history back in:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
# First turn
result1 = agent.run_conversation("My name is Alice")
history = result1["messages"]
# Second turn — agent remembers the context
result2 = agent.run_conversation(
"What's my name?",
conversation_history=history,
)
print(result2["final_response"]) # "Your name is Alice."
```
The `conversation_history` parameter accepts the `messages` list from a previous result. The agent copies it internally, so your original list is never mutated.
---
## Saving Trajectories
Enable trajectory saving to capture conversations in ShareGPT format — useful for generating training data or debugging:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
save_trajectories=True,
quiet_mode=True,
)
agent.chat("Write a Python function to sort a list")
# Saves to trajectory_samples.jsonl in ShareGPT format
```
Each conversation is appended as a single JSONL line, making it easy to collect datasets from automated runs.
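Reading the file back needs only the standard library. The field names below (`conversations`, `from`, `value`) follow the common ShareGPT convention and are an assumption about the exact output, so inspect a saved line before relying on them:

```python
import json

# One ShareGPT-style JSONL line (field names assumed, not verified)
line = '{"conversations": [{"from": "human", "value": "hi"}, {"from": "gpt", "value": "hello"}]}'
sample = json.loads(line)
for turn in sample["conversations"]:
    print(f"{turn['from']}: {turn['value']}")
```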
---
## Custom System Prompts
Use `ephemeral_system_prompt` to set a custom system prompt that guides the agent's behavior but is **not** saved to trajectory files (keeping your training data clean):
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
ephemeral_system_prompt="You are a SQL expert. Only answer database questions.",
quiet_mode=True,
)
response = agent.chat("How do I write a JOIN query?")
print(response)
```
This is ideal for building specialized agents — a code reviewer, a documentation writer, a SQL assistant — all using the same underlying tooling.
---
## Batch Processing
For running many prompts in parallel, Hermes includes `batch_runner.py`. It manages concurrent `AIAgent` instances with proper resource isolation:
```bash
python batch_runner.py --input prompts.jsonl --output results.jsonl
```
Each prompt gets its own `task_id` and isolated environment. If you need custom batch logic, you can build your own using `AIAgent` directly:
```python
import concurrent.futures
from run_agent import AIAgent
prompts = [
"Explain recursion",
"What is a hash table?",
"How does garbage collection work?",
]
def process_prompt(prompt):
# Create a fresh agent per task for thread safety
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
skip_memory=True,
)
return agent.chat(prompt)
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
results = list(executor.map(process_prompt, prompts))
for prompt, result in zip(prompts, results):
print(f"Q: {prompt}\nA: {result}\n")
```
:::warning
Always create a **new `AIAgent` instance per thread or task**. The agent maintains internal state (conversation history, tool sessions, iteration counters) that is not thread-safe to share.
:::
---
## Integration Examples
### FastAPI Endpoint
```python
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent
app = FastAPI()
class ChatRequest(BaseModel):
message: str
model: str = "anthropic/claude-sonnet-4"
@app.post("/chat")
def chat(request: ChatRequest):
    # Plain `def` endpoint: FastAPI runs it in a thread pool,
    # so the blocking agent.chat() call won't stall the event loop
agent = AIAgent(
model=request.model,
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
response = agent.chat(request.message)
return {"response": response}
```
### Discord Bot
```python
import asyncio

import discord
from run_agent import AIAgent
intents = discord.Intents.default()
intents.message_content = True  # required to read message text in discord.py 2.x
client = discord.Client(intents=intents)
@client.event
async def on_message(message):
    if message.author == client.user:
        return
    if message.content.startswith("!hermes "):
        query = message.content[8:]
        agent = AIAgent(
            model="anthropic/claude-sonnet-4",
            quiet_mode=True,
            skip_context_files=True,
            skip_memory=True,
            platform="discord",
        )
        # Run the blocking agent call off the event loop
        response = await asyncio.to_thread(agent.chat, query)
        await message.channel.send(response[:2000])
client.run("YOUR_DISCORD_TOKEN")
```
### CI/CD Pipeline Step
```python
#!/usr/bin/env python3
"""CI step: auto-review a PR diff."""
import subprocess
from run_agent import AIAgent
diff = subprocess.check_output(["git", "diff", "main...HEAD"]).decode()
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
disabled_toolsets=["terminal", "browser"],
)
review = agent.chat(
f"Review this PR diff for bugs, security issues, and style problems:\n\n{diff}"
)
print(review)
```
---
## Key Constructor Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `str` | `"anthropic/claude-opus-4.6"` | Model in OpenRouter format |
| `quiet_mode` | `bool` | `False` | Suppress CLI output |
| `enabled_toolsets` | `List[str]` | `None` | Whitelist specific toolsets |
| `disabled_toolsets` | `List[str]` | `None` | Blacklist specific toolsets |
| `save_trajectories` | `bool` | `False` | Save conversations to JSONL |
| `ephemeral_system_prompt` | `str` | `None` | Custom system prompt (not saved to trajectories) |
| `max_iterations` | `int` | `90` | Max tool-calling iterations per conversation |
| `skip_context_files` | `bool` | `False` | Skip loading AGENTS.md files |
| `skip_memory` | `bool` | `False` | Disable persistent memory read/write |
| `api_key` | `str` | `None` | API key (falls back to env vars) |
| `base_url` | `str` | `None` | Custom API endpoint URL |
| `platform` | `str` | `None` | Platform hint (`"discord"`, `"telegram"`, etc.) |
---
## Important Notes
:::tip
- Set **`skip_context_files=True`** if you don't want `AGENTS.md` files from the working directory loaded into the system prompt.
- Set **`skip_memory=True`** to prevent the agent from reading or writing persistent memory — recommended for stateless API endpoints.
- The `platform` parameter (e.g., `"discord"`, `"telegram"`) injects platform-specific formatting hints so the agent adapts its output style.
:::
:::warning
- **Thread safety**: Create one `AIAgent` per thread or task. Never share an instance across concurrent calls.
- **Resource cleanup**: The agent automatically cleans up resources (terminal sessions, browser instances) when a conversation ends. If you're running in a long-lived process, ensure each conversation completes normally.
- **Iteration limits**: The default `max_iterations=90` is generous. For simple Q&A use cases, consider lowering it (e.g., `max_iterations=10`) to prevent runaway tool-calling loops and control costs.
:::

---
sidebar_position: 3
title: "Tutorial: Team Telegram Assistant"
description: "Step-by-step guide to setting up a Telegram bot that your whole team can use for code help, research, system admin, and more"
---
# Set Up a Team Telegram Assistant
This tutorial walks you through setting up a Telegram bot powered by Hermes Agent that multiple team members can use. By the end, your team will have a shared AI assistant they can message for help with code, research, system administration, and anything else — secured with per-user authorization.
## What We're Building
A Telegram bot that:
- **Any authorized team member** can DM for help — code reviews, research, shell commands, debugging
- **Runs on your server** with full tool access — terminal, file editing, web search, code execution
- **Per-user sessions** — each person gets their own conversation context
- **Secure by default** — only approved users can interact, with two authorization methods
- **Scheduled tasks** — daily standups, health checks, and reminders delivered to a team channel
---
## Prerequisites
Before starting, make sure you have:
- **Hermes Agent installed** on a server or VPS (not your laptop — the bot needs to stay running). Follow the [installation guide](/getting-started/learning-path) if you haven't yet.
- **A Telegram account** for yourself (the bot owner)
- **An LLM provider configured** — at minimum, an API key for OpenAI, Anthropic, or another supported provider in `~/.hermes/.env`
:::tip
A $5/month VPS is plenty for running the gateway. Hermes itself is lightweight — the LLM API calls are what cost money, and those happen remotely.
:::
---
## Step 1: Create a Telegram Bot
Every Telegram bot starts with **@BotFather** — Telegram's official bot for creating bots.
1. **Open Telegram** and search for `@BotFather`, or go to [t.me/BotFather](https://t.me/BotFather)
2. **Send `/newbot`** — BotFather will ask you two things:
- **Display name** — what users see (e.g., `Team Hermes Assistant`)
- **Username** — must end in `bot` (e.g., `myteam_hermes_bot`)
3. **Copy the bot token** — BotFather replies with something like:
```
Use this token to access the HTTP API:
7123456789:AAH1bGciOiJSUzI1NiIsInR5cCI6Ikp...
```
Save this token — you'll need it in the next step.
4. **Set a description** (optional but recommended):
```
/setdescription
```
Choose your bot, then enter something like:
```
Team AI assistant powered by Hermes Agent. DM me for help with code, research, debugging, and more.
```
5. **Set bot commands** (optional — gives users a command menu):
```
/setcommands
```
Choose your bot, then paste:
```
new - Start a fresh conversation
model - Show or change the AI model
status - Show session info
help - Show available commands
stop - Stop the current task
```
:::warning
Keep your bot token secret. Anyone with the token can control the bot. If it leaks, use `/revoke` in BotFather to generate a new one.
:::
---
## Step 2: Configure the Gateway
You have two options: the interactive setup wizard (recommended) or manual configuration.
### Option A: Interactive Setup (Recommended)
```bash
hermes gateway setup
```
This walks you through everything with arrow-key selection. Pick **Telegram**, paste your bot token, and enter your user ID when prompted.
### Option B: Manual Configuration
Add these lines to `~/.hermes/.env`:
```bash
# Telegram bot token from BotFather
TELEGRAM_BOT_TOKEN=7123456789:AAH1bGciOiJSUzI1NiIsInR5cCI6Ikp...
# Your Telegram user ID (numeric)
TELEGRAM_ALLOWED_USERS=123456789
```
### Finding Your User ID
Your Telegram user ID is a numeric value (not your username). To find it:
1. Message [@userinfobot](https://t.me/userinfobot) on Telegram
2. It instantly replies with your numeric user ID
3. Copy that number into `TELEGRAM_ALLOWED_USERS`
:::info
Telegram user IDs are permanent numbers like `123456789`. They're different from your `@username`, which can change. Always use the numeric ID for allowlists.
:::
---
## Step 3: Start the Gateway
### Quick Test
Run the gateway in the foreground first to make sure everything works:
```bash
hermes gateway
```
You should see output like:
```
[Gateway] Starting Hermes Gateway...
[Gateway] Telegram adapter connected
[Gateway] Cron scheduler started (tick every 60s)
```
Open Telegram, find your bot, and send it a message. If it replies, you're in business. Press `Ctrl+C` to stop.
### Production: Install as a Service
For a persistent deployment that survives reboots:
```bash
hermes gateway install
sudo hermes gateway install --system # Linux only: boot-time system service
```
This creates a background service: a user-level **systemd** service on Linux by default, a **launchd** service on macOS, or a boot-time Linux system service if you pass `--system`.
```bash
# Linux — manage the default user service
hermes gateway start
hermes gateway stop
hermes gateway status
# View live logs
journalctl --user -u hermes-gateway -f
# Keep running after SSH logout
sudo loginctl enable-linger $USER
# Linux servers — explicit system-service commands
sudo hermes gateway start --system
sudo hermes gateway status --system
journalctl -u hermes-gateway -f
```
```bash
# macOS — manage the service
launchctl start ai.hermes.gateway
launchctl stop ai.hermes.gateway
tail -f ~/.hermes/logs/gateway.log
```
### Verify It's Running
```bash
hermes gateway status
```
Then send a test message to your bot on Telegram. You should get a response within a few seconds.
---
## Step 4: Set Up Team Access
Now let's give your teammates access. There are two approaches.
### Approach A: Static Allowlist
Collect each team member's Telegram user ID (have them message [@userinfobot](https://t.me/userinfobot)) and add them as a comma-separated list:
```bash
# In ~/.hermes/.env
TELEGRAM_ALLOWED_USERS=123456789,987654321,555555555
```
Restart the gateway after changes:
```bash
hermes gateway stop && hermes gateway start
```
### Approach B: DM Pairing (Recommended for Teams)
DM pairing is more flexible — you don't need to collect user IDs upfront. Here's how it works:
1. **Teammate DMs the bot** — since they're not on the allowlist, the bot replies with a one-time pairing code:
```
🔐 Pairing code: XKGH5N7P
Send this code to the bot owner for approval.
```
2. **Teammate sends you the code** (via any channel — Slack, email, in person)
3. **You approve it** on the server:
```bash
hermes pairing approve telegram XKGH5N7P
```
4. **They're in** — the bot immediately starts responding to their messages
**Managing paired users:**
```bash
# See all pending and approved users
hermes pairing list
# Revoke someone's access
hermes pairing revoke telegram 987654321
# Clear expired pending codes
hermes pairing clear-pending
```
:::tip
DM pairing is ideal for teams because you don't need to restart the gateway when adding new users. Approvals take effect immediately.
:::
### Security Considerations
- **Never set `GATEWAY_ALLOW_ALL_USERS=true`** on a bot with terminal access — anyone who finds your bot could run commands on your server
- Pairing codes expire after **1 hour** and use cryptographic randomness
- Rate limiting prevents brute-force attacks: 1 request per user per 10 minutes, max 3 pending codes per platform
- After 5 failed approval attempts, the platform enters a 1-hour lockout
- All pairing data is stored with `chmod 0600` permissions
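To make the code properties above concrete, here is a short Python sketch of how such a pairing code could be generated, with cryptographic randomness and a one-hour expiry. This is an illustration only, not Hermes internals; the alphabet, code length, and function name are assumptions.

```python
import secrets
import string
from datetime import datetime, timedelta, timezone

# Illustrative only -- not the actual Hermes pairing implementation.
ALPHABET = string.ascii_uppercase + string.digits  # produces codes like XKGH5N7P

def new_pairing_code(length: int = 8) -> tuple[str, datetime]:
    """Generate a one-time pairing code and its expiry timestamp (UTC)."""
    # secrets.choice uses a CSPRNG, unlike random.choice
    code = "".join(secrets.choice(ALPHABET) for _ in range(length))
    expires = datetime.now(timezone.utc) + timedelta(hours=1)  # codes expire after 1 hour
    return code, expires

code, expires = new_pairing_code()
print(code, expires)
```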
---
## Step 5: Configure the Bot
### Set a Home Channel
A **home channel** is where the bot delivers cron job results and proactive messages. Without one, scheduled tasks have nowhere to send output.
**Option 1:** Use the `/sethome` command in any Telegram group or chat where the bot is a member.
**Option 2:** Set it manually in `~/.hermes/.env`:
```bash
TELEGRAM_HOME_CHANNEL=-1001234567890
TELEGRAM_HOME_CHANNEL_NAME="Team Updates"
```
To find a channel ID, add [@userinfobot](https://t.me/userinfobot) to the group — it will report the group's chat ID.
### Configure Tool Progress Display
Control how much detail the bot shows when using tools. In `~/.hermes/config.yaml`:
```yaml
display:
tool_progress: new # off | new | all | verbose
```
| Mode | What You See |
|------|-------------|
| `off` | Clean responses only — no tool activity |
| `new` | Brief status for each new tool call (recommended for messaging) |
| `all` | Every tool call with details |
| `verbose` | Full tool output including command results |
Users can also change this per-session with the `/verbose` command in chat.
### Set Up a Personality with SOUL.md
Customize how the bot communicates by editing `~/.hermes/SOUL.md`. For a full guide, see [Use SOUL.md with Hermes](/docs/guides/use-soul-with-hermes). A starting point:
```markdown
# Soul
You are a helpful team assistant. Be concise and technical.
Use code blocks for any code. Skip pleasantries — the team
values directness. When debugging, always ask for error logs
before guessing at solutions.
```
### Add Project Context
If your team works on specific projects, create context files so the bot knows your stack:
```markdown
<!-- ~/.hermes/AGENTS.md -->
# Team Context
- We use Python 3.12 with FastAPI and SQLAlchemy
- Frontend is React with TypeScript
- CI/CD runs on GitHub Actions
- Production deploys to AWS ECS
- Always suggest writing tests for new code
```
:::info
Context files are injected into every session's system prompt. Keep them concise — every character counts against your token budget.
:::
---
## Step 6: Set Up Scheduled Tasks
With the gateway running, you can schedule recurring tasks that deliver results to your team channel.
### Daily Standup Summary
Message the bot on Telegram:
```
Every weekday at 9am, check the GitHub repository at
github.com/myorg/myproject for:
1. Pull requests opened/merged in the last 24 hours
2. Issues created or closed
3. Any CI/CD failures on the main branch
Format as a brief standup-style summary.
```
The agent creates a cron job automatically and delivers results to the chat where you asked (or the home channel).
### Server Health Check
```
Every 6 hours, check disk usage with 'df -h', memory with 'free -h',
and Docker container status with 'docker ps'. Report anything unusual —
partitions above 80%, containers that have restarted, or high memory usage.
```
### Managing Scheduled Tasks
```bash
# From the CLI
hermes cron list # View all scheduled jobs
hermes cron status # Check if scheduler is running
# From Telegram chat
/cron list # View jobs
/cron remove <job_id> # Remove a job
```
:::warning
Cron job prompts run in completely fresh sessions with no memory of prior conversations. Make sure each prompt contains **all** the context the agent needs — file paths, URLs, server addresses, and clear instructions.
:::
---
## Production Tips
### Use Docker for Safety
On a shared team bot, use Docker as the terminal backend so agent commands run in a container instead of on your host:
```bash
# In ~/.hermes/.env
TERMINAL_BACKEND=docker
TERMINAL_DOCKER_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20
```
Or in `~/.hermes/config.yaml`:
```yaml
terminal:
backend: docker
container_cpu: 1
container_memory: 5120
container_persistent: true
```
This way, even if someone asks the bot to run something destructive, your host system is protected.
### Monitor the Gateway
```bash
# Check if the gateway is running
hermes gateway status
# Watch live logs (Linux)
journalctl --user -u hermes-gateway -f
# Watch live logs (macOS)
tail -f ~/.hermes/logs/gateway.log
```
### Keep Hermes Updated
From Telegram, send `/update` to the bot — it will pull the latest version and restart. Or from the server:
```bash
hermes update
hermes gateway stop && hermes gateway start
```
### Log Locations
| What | Location |
|------|----------|
| Gateway logs | `journalctl --user -u hermes-gateway` (Linux) or `~/.hermes/logs/gateway.log` (macOS) |
| Cron job output | `~/.hermes/cron/output/{job_id}/{timestamp}.md` |
| Cron job definitions | `~/.hermes/cron/jobs.json` |
| Pairing data | `~/.hermes/pairing/` |
| Session history | `~/.hermes/sessions/` |
---
## Going Further
You've got a working team Telegram assistant. Here are some next steps:
- **[Security Guide](/user-guide/security)** — deep dive into authorization, container isolation, and command approval
- **[Messaging Gateway](/user-guide/messaging)** — full reference for gateway architecture, session management, and chat commands
- **[Telegram Setup](/user-guide/messaging/telegram)** — platform-specific details including voice messages and TTS
- **[Scheduled Tasks](/user-guide/features/cron)** — advanced cron scheduling with delivery options and cron expressions
- **[Context Files](/user-guide/features/context-files)** — AGENTS.md, SOUL.md, and .cursorrules for project knowledge
- **[Personality](/user-guide/features/personality)** — built-in personality presets and custom persona definitions
- **Add more platforms** — the same gateway can simultaneously run [Discord](/user-guide/messaging/discord), [Slack](/user-guide/messaging/slack), and [WhatsApp](/user-guide/messaging/whatsapp)
---
*Questions or issues? Open an issue on GitHub — contributions are welcome.*

---
sidebar_position: 1
title: "Tips & Best Practices"
description: "Practical advice to get the most out of Hermes Agent — prompt tips, CLI shortcuts, context files, memory, cost optimization, and security"
---
# Tips & Best Practices
A quick-wins collection of practical tips that make you immediately more effective with Hermes Agent. Each section targets a different aspect — scan the headers and jump to what's relevant.
---
## Getting the Best Results
### Be Specific About What You Want
Vague prompts produce vague results. Instead of "fix the code," say "fix the TypeError in `api/handlers.py` on line 47 — the `process_request()` function receives `None` from `parse_body()`." The more context you give, the fewer iterations you need.
### Provide Context Up Front
Front-load your request with the relevant details: file paths, error messages, expected behavior. One well-crafted message beats three rounds of clarification. Paste error tracebacks directly — the agent can parse them.
### Use Context Files for Recurring Instructions
If you find yourself repeating the same instructions ("use tabs not spaces," "we use pytest," "the API is at `/api/v2`"), put them in an `AGENTS.md` file. The agent reads it automatically every session — zero effort after setup.
### Let the Agent Use Its Tools
Don't try to hand-hold every step. Say "find and fix the failing test" rather than "open `tests/test_foo.py`, look at line 42, then..." The agent has file search, terminal access, and code execution — let it explore and iterate.
### Use Skills for Complex Workflows
Before writing a long prompt explaining how to do something, check if there's already a skill for it. Type `/skills` to browse available skills, or just invoke one directly like `/axolotl` or `/github-pr-workflow`.
## CLI Power User Tips
### Multi-Line Input
Press **Alt+Enter** (or **Ctrl+J**) to insert a newline without sending. This lets you compose multi-line prompts, paste code blocks, or structure complex requests before hitting Enter to send.
### Paste Detection
The CLI auto-detects multi-line pastes. Just paste a code block or error traceback directly — it won't send each line as a separate message. The paste is buffered and sent as one message.
### Interrupt and Redirect
Press **Ctrl+C** once to interrupt the agent mid-response. You can then type a new message to redirect it. Double-press Ctrl+C within 2 seconds to force exit. This is invaluable when the agent starts going down the wrong path.
### Resume Sessions with `-c`
Forgot something from your last session? Run `hermes -c` to resume exactly where you left off, with full conversation history restored. You can also resume by title: `hermes -r "my research project"`.
### Clipboard Image Paste
Press **Ctrl+V** to paste an image from your clipboard directly into the chat. The agent uses vision to analyze screenshots, diagrams, error popups, or UI mockups — no need to save to a file first.
### Slash Command Autocomplete
Type `/` and press **Tab** to see all available commands. This includes built-in commands (`/compress`, `/model`, `/title`) and every installed skill. You don't need to memorize anything — Tab completion has you covered.
:::tip
Use `/verbose` to cycle through tool output display modes: **off → new → all → verbose**. The "all" mode is great for watching what the agent does; "off" is cleanest for simple Q&A.
:::
## Context Files
### AGENTS.md: Your Project's Brain
Create an `AGENTS.md` in your project root with architecture decisions, coding conventions, and project-specific instructions. This is automatically injected into every session, so the agent always knows your project's rules.
```markdown
# Project Context
- This is a FastAPI backend with SQLAlchemy ORM
- Always use async/await for database operations
- Tests go in tests/ and use pytest-asyncio
- Never commit .env files
```
### SOUL.md: Customize Personality
Want Hermes to have a stable default voice? Edit `~/.hermes/SOUL.md` (or `$HERMES_HOME/SOUL.md` if you use a custom Hermes home). Hermes seeds a starter `SOUL.md` automatically and uses that global file as the instance-wide personality source.
For a full walkthrough, see [Use SOUL.md with Hermes](/docs/guides/use-soul-with-hermes).
```markdown
# Soul
You are a senior backend engineer. Be terse and direct.
Skip explanations unless asked. Prefer one-liners over verbose solutions.
Always consider error handling and edge cases.
```
Use `SOUL.md` for durable personality. Use `AGENTS.md` for project-specific instructions.
### .cursorrules Compatibility
Already have a `.cursorrules` or `.cursor/rules/*.mdc` file? Hermes reads those too. No need to duplicate your coding conventions — they're loaded automatically from the working directory.
### Hierarchical Discovery
Hermes walks the directory tree and discovers **all** `AGENTS.md` files at every level. In a monorepo, put project-wide conventions at the root and team-specific ones in subdirectories — they're all concatenated together with path headers.
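The hierarchical lookup described above can be sketched in a few lines of Python. This is an illustrative approximation, not the actual Hermes discovery code: it collects every `AGENTS.md` from the filesystem root down to a starting directory and concatenates them with path headers, root-level conventions first.

```python
from pathlib import Path

def collect_context_files(start: Path) -> str:
    """Gather AGENTS.md from every directory between the filesystem
    root and `start`, root-first, prefixing each with a path header.
    Illustrative sketch -- the real discovery logic may differ."""
    start = start.resolve()
    parts = []
    # reversed(parents) walks from the root down toward `start`
    for directory in [*reversed(start.parents), start]:
        candidate = directory / "AGENTS.md"
        if candidate.is_file():
            parts.append(f"# From {candidate}\n{candidate.read_text()}")
    return "\n\n".join(parts)

print(collect_context_files(Path.cwd()))
```

In a monorepo this ordering means root-wide conventions appear before team-specific ones, so more specific instructions can refine the general ones.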
:::tip
Keep context files focused and concise. Every character counts against your token budget since they're injected into every single message.
:::
## Memory & Skills
### Memory vs. Skills: What Goes Where
**Memory** is for facts: your environment, preferences, project locations, and things the agent has learned about you. **Skills** are for procedures: multi-step workflows, tool-specific instructions, and reusable recipes. Use memory for "what," skills for "how."
### When to Create Skills
If you find a task that takes 5+ steps and you'll do it again, ask the agent to create a skill for it. Say "save what you just did as a skill called `deploy-staging`." Next time, just type `/deploy-staging` and the agent loads the full procedure.
### Managing Memory Capacity
Memory is intentionally bounded (~2,200 chars for MEMORY.md, ~1,375 chars for USER.md). When it fills up, the agent consolidates entries. You can help by saying "clean up your memory" or "replace the old Python 3.9 note — we're on 3.12 now."
### Let the Agent Remember
After a productive session, say "remember this for next time" and the agent will save the key takeaways. You can also be specific: "save to memory that our CI uses GitHub Actions with the `deploy.yml` workflow."
:::warning
Memory is a frozen snapshot — changes made during a session don't appear in the system prompt until the next session starts. The agent writes to disk immediately, but the prompt cache isn't invalidated mid-session.
:::
## Performance & Cost
### Don't Break the Prompt Cache
Most LLM providers cache the system prompt prefix. If you keep your system prompt stable (same context files, same memory), subsequent messages in a session get **cache hits** that are significantly cheaper. Avoid changing the model or system prompt mid-session.
### Use /compress Before Hitting Limits
Long sessions accumulate tokens. When you notice responses slowing down or getting truncated, run `/compress`. This summarizes the conversation history, preserving key context while dramatically reducing token count. Use `/usage` to check where you stand.
### Delegate for Parallel Work
Need to research three topics at once? Ask the agent to use `delegate_task` with parallel subtasks. Each subagent runs independently with its own context, and only the final summaries come back — massively reducing your main conversation's token usage.
### Use execute_code for Batch Operations
Instead of running terminal commands one at a time, ask the agent to write a script that does everything at once. "Write a Python script to rename all `.jpeg` files to `.jpg` and run it" is cheaper and faster than renaming files individually.
### Choose the Right Model
Use `/model` to switch models mid-session. Use a frontier model (Claude Sonnet/Opus, GPT-4o) for complex reasoning and architecture decisions. Switch to a faster model for simple tasks like formatting, renaming, or boilerplate generation.
:::tip
Run `/usage` periodically to see your token consumption. Run `/insights` for a broader view of usage patterns over the last 30 days.
:::
## Messaging Tips
### Set a Home Channel
Use `/sethome` in your preferred Telegram or Discord chat to designate it as the home channel. Cron job results and scheduled task outputs are delivered here. Without it, the agent has nowhere to send proactive messages.
### Use /title to Organize Sessions
Name your sessions with `/title auth-refactor` or `/title research-llm-quantization`. Named sessions are easy to find with `hermes sessions list` and resume with `hermes -r "auth-refactor"`. Unnamed sessions pile up and become impossible to distinguish.
### DM Pairing for Team Access
Instead of manually collecting user IDs for allowlists, enable DM pairing. When a teammate DMs the bot, they get a one-time pairing code. You approve it with `hermes pairing approve telegram XKGH5N7P` — simple and secure.
### Tool Progress Display Modes
Use `/verbose` to control how much tool activity you see. In messaging platforms, less is usually more — keep it on "new" to see just new tool calls. In the CLI, "all" gives you a satisfying live view of everything the agent does.
:::tip
On messaging platforms, sessions auto-reset after idle time (default: 24 hours) or daily at 4 AM. Adjust per-platform in `~/.hermes/config.yaml` if you need longer sessions.
:::
## Security
### Use Docker for Untrusted Code
When working with untrusted repositories or running unfamiliar code, use Docker or Daytona as your terminal backend. Set `TERMINAL_BACKEND=docker` in your `.env`. Destructive commands inside a container can't harm your host system.
```bash
# In your .env:
TERMINAL_BACKEND=docker
TERMINAL_DOCKER_IMAGE=hermes-sandbox:latest
```
### Avoid Windows Encoding Pitfalls
On Windows, some default encodings (such as `cp125x`) cannot represent all Unicode characters, which can cause `UnicodeEncodeError` when writing files in tests or scripts.
- Prefer opening files with an explicit UTF-8 encoding:
```python
with open("results.txt", "w", encoding="utf-8") as f:
f.write("✓ All good\n")
```
- In PowerShell, you can also switch the current session to UTF-8 for console and native command output:
```powershell
$OutputEncoding = [Console]::OutputEncoding = [Text.UTF8Encoding]::new($false)
```
This keeps PowerShell and child processes on UTF-8 and helps avoid Windows-only failures.
### Review Before Choosing "Always"
When the agent triggers a dangerous command approval (`rm -rf`, `DROP TABLE`, etc.), you get four options: **once**, **session**, **always**, **deny**. Think carefully before choosing "always" — it permanently allowlists that pattern. Start with "session" until you're comfortable.
### Command Approval Is Your Safety Net
Hermes checks every command against a curated list of dangerous patterns before execution. This includes recursive deletes, SQL drops, piping curl to shell, and more. Don't disable this in production — it exists for good reasons.
:::warning
When running in a container backend (Docker, Singularity, Modal, Daytona), dangerous command checks are **skipped** because the container is the security boundary. Make sure your container images are properly locked down.
:::
### Use Allowlists for Messaging Bots
Never set `GATEWAY_ALLOW_ALL_USERS=true` on a bot with terminal access. Always use platform-specific allowlists (`TELEGRAM_ALLOWED_USERS`, `DISCORD_ALLOWED_USERS`) or DM pairing to control who can interact with your agent.
```bash
# Recommended: explicit allowlists per platform
TELEGRAM_ALLOWED_USERS=123456789,987654321
DISCORD_ALLOWED_USERS=123456789012345678
# Or use cross-platform allowlist
GATEWAY_ALLOWED_USERS=123456789,987654321
```
---
*Have a tip that should be on this page? Open an issue or PR — community contributions are welcome.*

---
sidebar_position: 5
title: "Use MCP with Hermes"
description: "A practical guide to connecting MCP servers to Hermes Agent, filtering their tools, and using them safely in real workflows"
---
# Use MCP with Hermes
This guide shows how to actually use MCP with Hermes Agent in day-to-day workflows.
The feature page explains what MCP is; this guide is about getting value from it quickly and safely.
## When should you use MCP?
Use MCP when:
- a tool already exists in MCP form and you do not want to build a native Hermes tool
- you want Hermes to operate against a local or remote system through a clean RPC layer
- you want fine-grained per-server exposure control
- you want to connect Hermes to internal APIs, databases, or company systems without modifying Hermes core
Do not use MCP when:
- a built-in Hermes tool already solves the job well
- the server exposes a huge dangerous tool surface and you are not prepared to filter it
- you only need one very narrow integration and a native tool would be simpler and safer
## Mental model
Think of MCP as an adapter layer:
- Hermes remains the agent
- MCP servers contribute tools
- Hermes discovers those tools at startup or reload time
- the model can use them like normal tools
- you control how much of each server is visible
That last part matters. Good MCP usage is not just "connect everything." It is "connect the right thing, with the smallest useful surface."
## Step 1: install MCP support
If you installed Hermes with the standard install script, MCP support is already included (the installer runs `uv pip install -e ".[all]"`).
If you installed without extras and need to add MCP separately:
```bash
cd ~/.hermes/hermes-agent
uv pip install -e ".[mcp]"
```
For npm-based servers, make sure Node.js and `npx` are available.
For many Python MCP servers, `uvx` is a nice default.
## Step 2: add one server first
Start with a single, safe server.
Example: filesystem access to one project directory only.
```yaml
mcp_servers:
project_fs:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/my-project"]
```
Then start Hermes:
```bash
hermes chat
```
Now ask something concrete:
```text
Inspect this project and summarize the repo layout.
```
## Step 3: verify MCP loaded
You can verify MCP in a few ways:
- Hermes banner/status should show MCP integration when configured
- ask Hermes what tools it has available
- use `/reload-mcp` after config changes
- check logs if the server failed to connect
A practical test prompt:
```text
Tell me which MCP-backed tools are available right now.
```
## Step 4: start filtering immediately
Do not wait until later if the server exposes a lot of tools.
### Example: whitelist only what you want
```yaml
mcp_servers:
github:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
tools:
include: [list_issues, create_issue, search_code]
```
This is usually the best default for sensitive systems.
### Example: blacklist dangerous actions
```yaml
mcp_servers:
stripe:
url: "https://mcp.stripe.com"
headers:
Authorization: "Bearer ***"
tools:
exclude: [delete_customer, refund_payment]
```
### Example: disable utility wrappers too
```yaml
mcp_servers:
docs:
url: "https://mcp.docs.example.com"
tools:
prompts: false
resources: false
```
## What does filtering actually affect?
There are two categories of MCP-exposed functionality in Hermes:
1. Server-native MCP tools
- filtered with:
- `tools.include`
- `tools.exclude`
2. Hermes-added utility wrappers
- filtered with:
- `tools.resources`
- `tools.prompts`
### Utility wrappers you may see
Resources:
- `list_resources`
- `read_resource`
Prompts:
- `list_prompts`
- `get_prompt`
These wrappers only appear if:
- your config allows them, and
- the MCP server session actually supports those capabilities
So Hermes will not pretend a server has resources/prompts if it does not.
## Common patterns
### Pattern 1: local project assistant
Use MCP for a repo-local filesystem or git server when you want Hermes to reason over a bounded workspace.
```yaml
mcp_servers:
fs:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/project"]
git:
command: "uvx"
args: ["mcp-server-git", "--repository", "/home/user/project"]
```
Good prompts:
```text
Review the project structure and identify where configuration lives.
```
```text
Check the local git state and summarize what changed recently.
```
### Pattern 2: GitHub triage assistant
```yaml
mcp_servers:
github:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
tools:
include: [list_issues, create_issue, update_issue, search_code]
prompts: false
resources: false
```
Good prompts:
```text
List open issues about MCP, cluster them by theme, and draft a high-quality issue for the most common bug.
```
```text
Search the repo for uses of _discover_and_register_server and explain how MCP tools are registered.
```
### Pattern 3: internal API assistant
```yaml
mcp_servers:
internal_api:
url: "https://mcp.internal.example.com"
headers:
Authorization: "Bearer ***"
tools:
include: [list_customers, get_customer, list_invoices]
resources: false
prompts: false
```
Good prompts:
```text
Look up customer ACME Corp and summarize recent invoice activity.
```
This is the sort of place where a strict whitelist is far better than an exclude list.
### Pattern 4: documentation / knowledge servers
Some MCP servers expose prompts or resources that are more like shared knowledge assets than direct actions.
```yaml
mcp_servers:
docs:
url: "https://mcp.docs.example.com"
tools:
prompts: true
resources: true
```
Good prompts:
```text
List available MCP resources from the docs server, then read the onboarding guide and summarize it.
```
```text
List prompts exposed by the docs server and tell me which ones would help with incident response.
```
## Tutorial: end-to-end setup with filtering
Here is a practical progression.
### Phase 1: add GitHub MCP with a tight whitelist
```yaml
mcp_servers:
github:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
tools:
include: [list_issues, create_issue, search_code]
prompts: false
resources: false
```
Start Hermes and ask:
```text
Search the codebase for references to MCP and summarize the main integration points.
```
### Phase 2: expand only when needed
If you later need issue updates too:
```yaml
tools:
include: [list_issues, create_issue, update_issue, search_code]
```
Then reload:
```text
/reload-mcp
```
### Phase 3: add a second server with different policy
```yaml
mcp_servers:
github:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "***"
tools:
include: [list_issues, create_issue, update_issue, search_code]
prompts: false
resources: false
filesystem:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/project"]
```
Now Hermes can combine them:
```text
Inspect the local project files, then create a GitHub issue summarizing the bug you find.
```
That is where MCP gets powerful: multi-system workflows without changing Hermes core.
## Safe usage recommendations
### Prefer allowlists for dangerous systems
For anything financial, customer-facing, or destructive:
- use `tools.include`
- start with the smallest set possible
### Disable unused utilities
If you do not want the model browsing server-provided resources/prompts, turn them off:
```yaml
tools:
resources: false
prompts: false
```
### Keep servers scoped narrowly
Examples:
- filesystem server rooted to one project dir, not your whole home directory
- git server pointed at one repo
- internal API server with read-heavy tool exposure by default
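For instance, the reference git MCP server accepts a repository flag, so it can be pinned to one repo rather than wandering the filesystem (flag names may differ by server version, so check your server's docs):

```yaml
mcp_servers:
  git:
    command: "uvx"
    args: ["mcp-server-git", "--repository", "/home/user/project"]
```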
### Reload after config changes
```text
/reload-mcp
```
Do this after changing:
- include/exclude lists
- enabled flags
- resources/prompts toggles
- auth headers / env
## Troubleshooting by symptom
### "The server connects but the tools I expected are missing"
Possible causes:
- filtered by `tools.include`
- excluded by `tools.exclude`
- utility wrappers disabled via `resources: false` or `prompts: false`
- server does not actually support resources/prompts
### "The server is configured but nothing loads"
Check:
- `enabled: false` was not left in config
- command/runtime exists (`npx`, `uvx`, etc.)
- HTTP endpoint is reachable
- auth env or headers are correct
### "Why do I see fewer tools than the MCP server advertises?"
Because Hermes now respects your per-server policy and capability-aware registration. That is expected, and usually desirable.
### "How do I remove an MCP server without deleting the config?"
Use:
```yaml
enabled: false
```
That keeps the config around but prevents connection and registration.
## Recommended first MCP setups
Good first servers for most users:
- filesystem
- git
- GitHub
- fetch / documentation MCP servers
- one narrow internal API
Not-great first servers:
- giant business systems with lots of destructive actions and no filtering
- anything you do not understand well enough to constrain
## Related docs
- [MCP (Model Context Protocol)](/docs/user-guide/features/mcp)
- [FAQ](/docs/reference/faq)
- [Slash Commands](/docs/reference/slash-commands)

---
sidebar_position: 6
title: "Use SOUL.md with Hermes"
description: "How to use SOUL.md to shape Hermes Agent's default voice, what belongs there, and how it differs from AGENTS.md and /personality"
---
# Use SOUL.md with Hermes
`SOUL.md` is the **primary identity** for your Hermes instance. It's the first thing in the system prompt — it defines who the agent is, how it speaks, and what it avoids.
If you want Hermes to feel like the same assistant every time you talk to it — or if you want to replace the Hermes persona entirely with your own — this is the file to use.
## What SOUL.md is for
Use `SOUL.md` for:
- tone
- personality
- communication style
- how direct or warm Hermes should be
- what Hermes should avoid stylistically
- how Hermes should relate to uncertainty, disagreement, and ambiguity
In short:
- `SOUL.md` is about who Hermes is and how Hermes speaks
## What SOUL.md is not for
Do not use it for:
- repo-specific coding conventions
- file paths
- commands
- service ports
- architecture notes
- project workflow instructions
Those belong in `AGENTS.md`.
A good rule:
- if it should apply everywhere, put it in `SOUL.md`
- if it only belongs to one project, put it in `AGENTS.md`
## Where it lives
Hermes now uses only the global SOUL file for the current instance:
```text
~/.hermes/SOUL.md
```
If you run Hermes with a custom home directory, it becomes:
```text
$HERMES_HOME/SOUL.md
```
## First-run behavior
Hermes automatically seeds a starter `SOUL.md` for you if one does not already exist.
That means most users now begin with a real file they can read and edit immediately.
Important:
- if you already have a `SOUL.md`, Hermes does not overwrite it
- if the file exists but is empty, Hermes adds nothing from it to the prompt
## How Hermes uses it
When Hermes starts a session, it reads `SOUL.md` from `HERMES_HOME`, scans it for prompt-injection patterns, truncates it if needed, and uses it as the **agent identity** — slot #1 in the system prompt. This means SOUL.md completely replaces the built-in default identity text.
If SOUL.md is missing, empty, or cannot be loaded, Hermes falls back to a built-in default identity.
No wrapper language is added around the file. The content itself matters — write the way you want your agent to think and speak.
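The load-and-fallback behavior can be pictured with a short sketch. The function name, truncation limit, and default text here are assumptions for illustration, not Hermes internals:

```python
from pathlib import Path

# Illustrative stand-ins; Hermes's real default identity and limit differ
DEFAULT_IDENTITY = "You are Hermes, a helpful agent."
MAX_CHARS = 8000

def load_identity(hermes_home):
    """Read SOUL.md from HERMES_HOME, falling back to the built-in default
    when the file is missing, unreadable, or empty."""
    path = Path(hermes_home) / "SOUL.md"
    try:
        text = path.read_text(encoding="utf-8").strip()
    except OSError:
        return DEFAULT_IDENTITY  # missing or unreadable -> built-in default
    if not text:
        return DEFAULT_IDENTITY  # empty file contributes nothing
    return text[:MAX_CHARS]      # oversized files are truncated
```

The key property is the quiet fallback: an absent or empty file never breaks startup, it just means the default voice is used.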
## A good first edit
If you do nothing else, open the file and change just a few lines so it feels like you.
For example:
```markdown
You are direct, calm, and technically precise.
Prefer substance over politeness theater.
Push back clearly when an idea is weak.
Keep answers compact unless deeper detail is useful.
```
That alone can noticeably change how Hermes feels.
## Example styles
### 1. Pragmatic engineer
```markdown
You are a pragmatic senior engineer.
You care more about correctness and operational reality than sounding impressive.
## Style
- Be direct
- Be concise unless complexity requires depth
- Say when something is a bad idea
- Prefer practical tradeoffs over idealized abstractions
## Avoid
- Sycophancy
- Hype language
- Overexplaining obvious things
```
### 2. Research partner
```markdown
You are a thoughtful research collaborator.
You are curious, honest about uncertainty, and excited by unusual ideas.
## Style
- Explore possibilities without pretending certainty
- Distinguish speculation from evidence
- Ask clarifying questions when the idea space is underspecified
- Prefer conceptual depth over shallow completeness
```
### 3. Teacher / explainer
```markdown
You are a patient technical teacher.
You care about understanding, not performance.
## Style
- Explain clearly
- Use examples when they help
- Do not assume prior knowledge unless the user signals it
- Build from intuition to details
```
### 4. Tough reviewer
```markdown
You are a rigorous reviewer.
You are fair, but you do not soften important criticism.
## Style
- Point out weak assumptions directly
- Prioritize correctness over harmony
- Be explicit about risks and tradeoffs
- Prefer blunt clarity to vague diplomacy
```
## What makes a strong SOUL.md?
A strong `SOUL.md` is:
- stable
- broadly applicable
- specific in voice
- not overloaded with temporary instructions
A weak `SOUL.md` is:
- full of project details
- contradictory
- trying to micro-manage every response shape
- mostly generic filler like "be helpful" and "be clear"
Hermes already tries to be helpful and clear. `SOUL.md` should add real personality and style, not restate obvious defaults.
## Suggested structure
You do not need headings, but they help.
A simple structure that works well:
```markdown
# Identity
Who Hermes is.
# Style
How Hermes should sound.
# Avoid
What Hermes should not do.
# Defaults
How Hermes should behave when ambiguity appears.
```
## SOUL.md vs /personality
These are complementary.
Use `SOUL.md` for your durable baseline.
Use `/personality` for temporary mode switches.
Examples:
- your default SOUL is pragmatic and direct
- then for one session you use `/personality teacher`
- later you switch back without changing your base voice file
## SOUL.md vs AGENTS.md
This is the most common mistake.
### Put this in SOUL.md
- “Be direct.”
- “Avoid hype language.”
- “Prefer short answers unless depth helps.”
- “Push back when the user is wrong.”
### Put this in AGENTS.md
- “Use pytest, not unittest.”
- “Frontend lives in `frontend/`.”
- “Never edit migrations directly.”
- “The API runs on port 8000.”
## How to edit it
```bash
nano ~/.hermes/SOUL.md
```
or
```bash
vim ~/.hermes/SOUL.md
```
Then restart Hermes or start a new session.
## A practical workflow
1. Start with the seeded default file
2. Trim anything that does not feel like the voice you want
3. Add four to eight lines that clearly define tone and defaults
4. Talk to Hermes for a while
5. Adjust based on what still feels off
That iterative approach works better than trying to design the perfect personality in one shot.
## Troubleshooting
### I edited SOUL.md but Hermes still sounds the same
Check:
- you edited `~/.hermes/SOUL.md` or `$HERMES_HOME/SOUL.md`
- not some repo-local `SOUL.md`
- the file is not empty
- your session was restarted after the edit
- a `/personality` overlay is not dominating the result
### Hermes is ignoring parts of my SOUL.md
Possible causes:
- higher-priority instructions are overriding it
- the file includes conflicting guidance
- the file is too long and got truncated
- some of the text resembles prompt-injection content and may be blocked or altered by the scanner
### My SOUL.md became too project-specific
Move project instructions into `AGENTS.md` and keep `SOUL.md` focused on identity and style.
## Related docs
- [Personality & SOUL.md](/docs/user-guide/features/personality)
- [Context Files](/docs/user-guide/features/context-files)
- [Configuration](/docs/user-guide/configuration)
- [Tips & Best Practices](/docs/guides/tips)

---
sidebar_position: 7
title: "Use Voice Mode with Hermes"
description: "A practical guide to setting up and using Hermes voice mode across CLI, Telegram, Discord, and Discord voice channels"
---
# Use Voice Mode with Hermes
This guide is the practical companion to the [Voice Mode feature reference](/docs/user-guide/features/voice-mode).
If the feature page explains what voice mode can do, this guide shows how to actually use it well.
## What voice mode is good for
Voice mode is especially useful when:
- you want a hands-free CLI workflow
- you want spoken responses in Telegram or Discord
- you want Hermes sitting in a Discord voice channel for live conversation
- you want quick idea capture, debugging, or back-and-forth while walking around instead of typing
## Choose your voice mode setup
There are really three different voice experiences in Hermes.
| Mode | Best for | Platform |
|---|---|---|
| Interactive microphone loop | Personal hands-free use while coding or researching | CLI |
| Voice replies in chat | Spoken responses alongside normal messaging | Telegram, Discord |
| Live voice channel bot | Group or personal live conversation in a VC | Discord voice channels |
A good path is:
1. get text working first
2. enable voice replies second
3. move to Discord voice channels last if you want the full experience
## Step 1: make sure normal Hermes works first
Before touching voice mode, verify that:
- Hermes starts
- your provider is configured
- the agent can answer text prompts normally
```bash
hermes
```
Ask something simple:
```text
What tools do you have available?
```
If that is not solid yet, fix text mode first.
## Step 2: install the right extras
### CLI microphone + playback
```bash
pip install "hermes-agent[voice]"
```
### Messaging platforms
```bash
pip install "hermes-agent[messaging]"
```
### Premium ElevenLabs TTS
```bash
pip install "hermes-agent[tts-premium]"
```
### Local NeuTTS (optional)
```bash
python -m pip install -U neutts[all]
```
### Everything
```bash
pip install "hermes-agent[all]"
```
## Step 3: install system dependencies
### macOS
```bash
brew install portaudio ffmpeg opus
brew install espeak-ng
```
### Ubuntu / Debian
```bash
sudo apt install portaudio19-dev ffmpeg libopus0
sudo apt install espeak-ng
```
Why these matter:
- `portaudio` → microphone input / playback for CLI voice mode
- `ffmpeg` → audio conversion for TTS and messaging delivery
- `opus` → Discord voice codec support
- `espeak-ng` → phonemizer backend for NeuTTS
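A quick way to confirm the command-line tools are on your PATH. This is a sketch: `portaudio` and `opus` are libraries rather than commands, so verify those through your package manager instead:

```shell
# Report which voice-mode binaries are installed
check_deps() {
  for dep in "$@"; do
    if command -v "$dep" >/dev/null 2>&1; then
      echo "$dep: ok"
    else
      echo "$dep: missing"
    fi
  done
}

check_deps ffmpeg espeak-ng
```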
## Step 4: choose STT and TTS providers
Hermes supports both local and cloud speech stacks.
### Easiest / cheapest setup
Use local STT and free Edge TTS:
- STT provider: `local`
- TTS provider: `edge`
This is usually the best place to start.
### Environment file example
Add to `~/.hermes/.env`:
```bash
# Cloud STT options (local needs no key)
GROQ_API_KEY=***
VOICE_TOOLS_OPENAI_KEY=***
# Premium TTS (optional)
ELEVENLABS_API_KEY=***
```
### Provider recommendations
#### Speech-to-text
- `local` → best default for privacy and zero-cost use
- `groq` → very fast cloud transcription
- `openai` → good paid fallback
#### Text-to-speech
- `edge` → free and good enough for most users
- `neutts` → free local/on-device TTS
- `elevenlabs` → best quality
- `openai` → good middle ground
### If you use `hermes setup`
If you choose NeuTTS in the setup wizard, Hermes checks whether `neutts` is already installed. If it is missing, the wizard:

- tells you NeuTTS needs the Python package `neutts` and the system package `espeak-ng`
- offers to install them for you
- installs `espeak-ng` with your platform package manager
- then runs:
```bash
python -m pip install -U neutts[all]
```
If you skip that install or it fails, the wizard falls back to Edge TTS.
## Step 5: recommended config
```yaml
voice:
record_key: "ctrl+b"
max_recording_seconds: 120
auto_tts: false
silence_threshold: 200
silence_duration: 3.0
stt:
provider: "local"
local:
model: "base"
tts:
provider: "edge"
edge:
voice: "en-US-AriaNeural"
```
This is a good conservative default for most people.
If you want local TTS instead, switch the `tts` block to:
```yaml
tts:
provider: "neutts"
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
## Use case 1: CLI voice mode
### Turn it on
Start Hermes:
```bash
hermes
```
Inside the CLI:
```text
/voice on
```
### Recording flow
Default key:
- `Ctrl+B`
Workflow:
1. press `Ctrl+B`
2. speak
3. wait for silence detection to stop recording automatically
4. Hermes transcribes and responds
5. if TTS is on, it speaks the answer
6. the loop can automatically restart for continuous use
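Step 3 hinges on silence detection: recording stops once enough consecutive quiet frames accumulate. Here is an illustrative sketch of that stop condition, using one amplitude value per frame for simplicity; this is not Hermes's actual audio code:

```python
def record_until_silence(frames, frame_seconds=0.5,
                         silence_duration=3.0, threshold=200):
    """Collect frames until `silence_duration` seconds' worth of
    consecutive frames fall below the amplitude threshold."""
    needed = int(silence_duration / frame_seconds)
    captured, quiet = [], 0
    for amp in frames:
        captured.append(amp)
        quiet = quiet + 1 if amp < threshold else 0  # reset on speech
        if quiet >= needed:
            break
    return captured

# Speech, then six 0.5 s quiet frames (3 s of silence) ends the recording
mic = [900, 1200, 50, 800, 10, 20, 30, 10, 5, 15, 700, 999]
print(len(record_until_silence(mic)))  # → 10
```

Notice the counter resets whenever speech reappears, which is why brief pauses mid-sentence do not cut you off.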
### Useful commands
```text
/voice
/voice on
/voice off
/voice tts
/voice status
```
### Good CLI workflows
#### Walk-up debugging
Say:
```text
I keep getting a docker permission error. Help me debug it.
```
Then continue hands-free:
- "Read the last error again"
- "Explain the root cause in simpler terms"
- "Now give me the exact fix"
#### Research / brainstorming
Great for:
- walking around while thinking
- dictating half-formed ideas
- asking Hermes to structure your thoughts in real time
#### Accessibility / low-typing sessions
If typing is inconvenient, voice mode is one of the fastest ways to stay in the full Hermes loop.
## Tuning CLI behavior
### Silence threshold
If Hermes starts/stops too aggressively, tune:
```yaml
voice:
silence_threshold: 250
```
Higher threshold = less sensitive.
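A silence gate of this kind typically compares a frame's RMS amplitude against the threshold; raising the threshold means louder sounds still count as silence. A minimal sketch, not Hermes's actual implementation:

```python
def rms(samples):
    """Root-mean-square amplitude of a list of 16-bit PCM samples."""
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def is_silent(samples, silence_threshold=200):
    return rms(samples) < silence_threshold

print(is_silent([0, 5, -3, 2]))        # quiet frame -> True
print(is_silent([3000, -2800, 3100]))  # loud speech frame -> False
```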
### Silence duration
If you pause a lot between sentences, increase:
```yaml
voice:
silence_duration: 4.0
```
### Record key
If `Ctrl+B` conflicts with your terminal or tmux habits:
```yaml
voice:
record_key: "ctrl+space"
```
## Use case 2: voice replies in Telegram or Discord
This mode is simpler than full voice channels.
Hermes stays a normal chat bot, but can speak replies.
### Start the gateway
```bash
hermes gateway
```
### Turn on voice replies
Inside Telegram or Discord:
```text
/voice on
```
or
```text
/voice tts
```
### Modes
| Mode | Meaning |
|---|---|
| `off` | text only |
| `voice_only` | speak only when the user sent voice |
| `all` | speak every reply |
### When to use which mode
- `/voice on` if you want spoken replies only for voice-originating messages
- `/voice tts` if you want a full spoken assistant all the time
### Good messaging workflows
#### Telegram assistant on your phone
Use when:
- you are away from your machine
- you want to send voice notes and get quick spoken replies
- you want Hermes to function like a portable research or ops assistant
#### Discord DMs with spoken output
Useful when you want private interaction without server-channel mention behavior.
## Use case 3: Discord voice channels
This is the most advanced mode.
Hermes joins a Discord VC, listens to user speech, transcribes it, runs the normal agent pipeline, and speaks replies back into the channel.
### Required Discord permissions
In addition to the normal text-bot setup, make sure the bot has:
- Connect
- Speak
- preferably Use Voice Activity
Also enable privileged intents in the Developer Portal:
- Presence Intent
- Server Members Intent
- Message Content Intent
### Join and leave
In a Discord text channel where the bot is present:
```text
/voice join
/voice leave
/voice status
```
### What happens when joined
- users speak in the VC
- Hermes detects speech boundaries
- transcripts are posted in the associated text channel
- Hermes responds in text and audio
- the text channel is the one where `/voice join` was issued
### Best practices for Discord VC use
- keep `DISCORD_ALLOWED_USERS` tight
- use a dedicated bot/testing channel at first
- verify STT and TTS work in ordinary text-chat voice mode before trying VC mode
## Voice quality recommendations
### Best quality setup
- STT: local `large-v3` or Groq `whisper-large-v3`
- TTS: ElevenLabs
### Best speed / convenience setup
- STT: local `base` or Groq
- TTS: Edge
### Best zero-cost setup
- STT: local
- TTS: Edge
## Common failure modes
### "No audio device found"
Install `portaudio`.
### "Bot joins but hears nothing"
Check:
- your Discord user ID is in `DISCORD_ALLOWED_USERS`
- you are not muted
- privileged intents are enabled
- the bot has Connect/Speak permissions
### "It transcribes but does not speak"
Check:
- TTS provider config
- API key / quota for ElevenLabs or OpenAI
- `ffmpeg` install for Edge conversion paths
### "Whisper outputs garbage"
Try:
- quieter environment
- higher `silence_threshold`
- different STT provider/model
- shorter, clearer utterances
### "It works in DMs but not in server channels"
That is often mention policy.
By default, the bot needs an `@mention` in Discord server text channels unless configured otherwise.
## Suggested first-week setup
If you want the shortest path to success:
1. get text Hermes working
2. install `hermes-agent[voice]`
3. use CLI voice mode with local STT + Edge TTS
4. then enable `/voice on` in Telegram or Discord
5. only after that, try Discord VC mode
That progression keeps the debugging surface small.
## Where to read next
- [Voice Mode feature reference](/docs/user-guide/features/voice-mode)
- [Messaging Gateway](/docs/user-guide/messaging)
- [Discord setup](/docs/user-guide/messaging/discord)
- [Telegram setup](/docs/user-guide/messaging/telegram)
- [Configuration](/docs/user-guide/configuration)