docs: improve voice mode docs with prerequisites, startup commands, and platform links

0xbyt4 2026-03-11 15:29:23 +03:00
parent 2bb2312ea2
commit 75bd5a582b

@@ -8,6 +8,18 @@ description: "Real-time voice conversations with Hermes Agent — CLI, Telegram,
Hermes Agent supports full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
## Prerequisites
Before using voice features, make sure you have:
1. **Hermes Agent installed**: `pip install hermes-agent` (see [Getting Started](../../getting-started.md))
2. **An LLM provider configured** — set `OPENAI_API_KEY`, `OPENAI_BASE_URL`, and `LLM_MODEL` in `~/.hermes/.env`
3. **A working base setup** — run `hermes` to verify the agent responds to text before enabling voice
:::tip
The `~/.hermes/` directory and default `config.yaml` are created automatically the first time you run `hermes`. You only need to create `~/.hermes/.env` manually for API keys.
:::
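The second prerequisite can be checked before launching the agent. The following is a minimal sketch, not part of Hermes Agent itself: the helper names `parse_env` and `missing_keys` are hypothetical, and the parser assumes plain `KEY=value` lines with optional `#` comments, which may not cover every syntax the agent accepts.

```python
from pathlib import Path

# Keys named in the prerequisites list above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "OPENAI_BASE_URL", "LLM_MODEL")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as `your-key # note`.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value.strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in listed order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path.home() / ".hermes" / ".env"
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    print("missing:", ", ".join(missing) if missing else "none")
```

Running the script prints the keys that still need a value; it only inspects the file and does not contact the provider.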
## Overview
| Feature | Platform | Description |
@@ -79,6 +91,14 @@ ELEVENLABS_API_KEY=your-key # ElevenLabs — premium quality
### Quick Start
Start the CLI and enable voice mode:
```bash
hermes # Start the interactive CLI
```
Then use these commands inside the CLI:
```
/voice       Toggle voice mode on/off
/voice on    Enable voice mode
@@ -89,7 +109,7 @@ ELEVENLABS_API_KEY=your-key # ElevenLabs — premium quality
### How It Works
-1. Enable voice mode with `/voice on`
+1. Start the CLI with `hermes` and enable voice mode with `/voice on`
2. **Press Ctrl+B** — a beep plays (880Hz), recording starts
3. **Speak** — a live audio level bar shows your input: `● [▁▂▃▅▇▇▅▂] `
4. **Stop speaking** — after 3 seconds of silence, recording auto-stops
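The auto-stop rule in step 4 can be sketched as a pure function over per-frame audio levels. This is an illustration only, not the agent's recorder: the function name `auto_stop_index`, the `threshold` value, and the choice to ignore silence before any speech is heard are all assumptions. Only the 3-second silence window comes from the docs.

```python
def auto_stop_index(levels, frame_seconds, silence_seconds=3.0, threshold=0.01):
    """Return the index of the frame at which recording would stop, or None.

    levels: per-frame loudness values (for example RMS in the range 0..1).
    frame_seconds: duration of each frame in seconds.
    """
    silent_for = 0.0
    heard_speech = False
    for index, level in enumerate(levels):
        if level >= threshold:
            # Any loud frame resets the silence timer.
            heard_speech = True
            silent_for = 0.0
        elif heard_speech:
            # Count silence only once the speaker has started (assumption).
            silent_for += frame_seconds
            if silent_for >= silence_seconds:
                return index
    return None


# Half-second frames: one quiet frame, two loud frames, then six quiet frames.
frames = [0.0, 0.2, 0.3] + [0.0] * 6
stop_at = auto_stop_index(frames, frame_seconds=0.5)
```

With half-second frames, six consecutive quiet frames after speech add up to the 3-second window, so the sketch stops on the sixth quiet frame.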
@@ -125,12 +145,23 @@ When TTS is enabled, the agent speaks its reply **sentence-by-sentence** as it g
### Hallucination Filter
-Whisper sometimes generates phantom text from silence or background noise ("Thank you for watching", "Subscribe", etc.). The agent filters these out using a database of 498+ known hallucination phrases across multiple languages.
+Whisper sometimes generates phantom text from silence or background noise ("Thank you for watching", "Subscribe", etc.). The agent filters these out using a set of 26 known hallucination phrases across multiple languages, plus a regex pattern that catches repetitive variations.
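A phrase-set-plus-regex filter of the kind described here could look like the sketch below. It is not the agent's actual filter: `KNOWN_PHRASES` holds only the two phrases quoted in the text rather than the real list, and the names `normalize` and `is_hallucination`, along with the exact regex, are hypothetical.

```python
import re

# Illustrative subset: only the two phrases quoted in the docs.
KNOWN_PHRASES = {"thank you for watching", "subscribe"}

_PUNCTUATION = re.compile(r"[^\w\s]")

# Matches one known phrase repeated one or more times, e.g. "subscribe subscribe".
_REPEATED = re.compile(
    r"^(?:(?:%s)\s*)+$"
    % "|".join(re.escape(p) for p in sorted(KNOWN_PHRASES, key=len, reverse=True))
)


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(_PUNCTUATION.sub(" ", text.lower()).split())


def is_hallucination(transcript):
    """Return True if the transcript is a known phantom phrase or a repeat of one."""
    cleaned = normalize(transcript)
    return cleaned in KNOWN_PHRASES or bool(_REPEATED.match(cleaned))
```

Normalizing first lets the exact-match set stay small, while the regex handles repeated variants such as the same phrase transcribed twice in a row.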
---
## Gateway Voice Reply (Telegram & Discord)
If you haven't set up your messaging bots yet, see the platform-specific guides:
- [Telegram Setup Guide](../messaging/telegram.md)
- [Discord Setup Guide](../messaging/discord.md)
Start the gateway to connect to your messaging platforms:
```bash
hermes gateway # Start the gateway (connects to configured platforms)
hermes gateway setup # Interactive setup wizard for first-time configuration
```
### Commands
These work in both Telegram and Discord text channels:
@@ -245,8 +276,18 @@ GROQ_API_KEY=your-key # Recommended (fast, free tier)
# ELEVENLABS_API_KEY=your-key # Premium quality
```
### Start the Gateway
```bash
hermes gateway # Start with existing configuration
```
The bot should come online in Discord within a few seconds.
### Commands
Use these in the Discord text channel where the bot is present:
```
/voice join      Bot joins your current voice channel
/voice channel   Alias for /voice join