Add messaging platform enhancements: STT, stickers, Discord UX, Slack, pairing, hooks

Major feature additions inspired by OpenClaw/ClawdBot integration analysis:

Voice Message Transcription (STT):
- Auto-transcribe voice/audio messages via OpenAI Whisper API
- Download voice to ~/.hermes/audio_cache/ on Telegram/Discord/WhatsApp
- Inject transcript as text so all models can understand voice input
- Configurable model (whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe)

Telegram Sticker Understanding:
- Describe static stickers via vision tool with JSON-backed cache
- Cache keyed by file_unique_id avoids redundant API calls
- Animated/video stickers get emoji-based fallback description
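
A minimal sketch of the sticker cache described above, assuming a flat JSON file that maps file_unique_id to a description. The names StickerCache and describe_sticker, and the wording of the emoji fallback, are hypothetical and not taken from this commit. The vision call is passed in as a plain callable so the caching logic can be read on its own.

```python
import json
import os
from pathlib import Path
from typing import Callable, Dict, Optional


class StickerCache:
    """JSON-backed map of file_unique_id -> description (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._data: Dict[str, str] = {}
        if path.exists():
            try:
                loaded = json.loads(path.read_text(encoding="utf-8"))
                if isinstance(loaded, dict):
                    self._data = loaded
            except (json.JSONDecodeError, OSError):
                # A corrupt or unreadable cache is treated as empty.
                self._data = {}

    def get(self, file_unique_id: str) -> Optional[str]:
        return self._data.get(file_unique_id)

    def put(self, file_unique_id: str, description: str) -> None:
        self._data[file_unique_id] = description
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temp file, then rename, so readers never see a partial file.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self._data, ensure_ascii=False), encoding="utf-8")
        os.replace(tmp, self.path)


def describe_sticker(
    cache: StickerCache,
    file_unique_id: str,
    emoji: str,
    is_static: bool,
    describe: Callable[[], str],
) -> str:
    """Return a description, calling describe() only on a cache miss."""
    if not is_static:
        # Animated/video stickers skip the vision call entirely.
        return f"an animated sticker expressing {emoji}"
    cached = cache.get(file_unique_id)
    if cached is not None:
        return cached
    description = describe()
    cache.put(file_unique_id, description)
    return description
```

Keying on file_unique_id rather than file_id matters because Telegram's file_id can differ between bots and over time, while file_unique_id is stable for the same file.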

Discord Rich UX:
- Native slash commands (/ask, /reset, /status, /stop) via app_commands
- Button-based exec approvals (Allow Once / Always Allow / Deny)
- ExecApprovalView with user authorization and timeout handling

Slack Integration:
- Full SlackAdapter using slack-bolt with Socket Mode
- DMs, channel messages (mention-gated), /hermes slash command
- File attachment handling with bot-token-authenticated downloads

DM Pairing System:
- Code-based user authorization as alternative to static allowlists
- 8-char codes from unambiguous alphabet, 1-hour expiry
- Rate limiting, lockout after failed attempts, chmod 0600 on data
- CLI: hermes pairing list/approve/revoke/clear-pending
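
The code generation and expiry rules above could look roughly like the following. This is a sketch under stated assumptions: the exact alphabet is not given in the commit message, so PAIRING_ALPHABET is an assumed one that drops lookalike characters, and the function names are hypothetical.

```python
import hmac
import secrets
import time
from typing import Dict, Optional, Union

# Assumed alphabet: uppercase letters and digits without 0/O, 1/I/L lookalikes.
PAIRING_ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"
CODE_LENGTH = 8
CODE_TTL_SECONDS = 60 * 60  # 1-hour expiry, as stated in the commit message

PairingEntry = Dict[str, Union[str, float]]


def generate_pairing_code(now: Optional[float] = None) -> PairingEntry:
    """Create a pairing entry with a random code and an expiry timestamp."""
    issued = time.time() if now is None else now
    code = "".join(secrets.choice(PAIRING_ALPHABET) for _ in range(CODE_LENGTH))
    return {"code": code, "expires_at": issued + CODE_TTL_SECONDS}


def is_code_valid(entry: PairingEntry, submitted: str, now: Optional[float] = None) -> bool:
    """Check a submitted code against an entry, rejecting expired entries."""
    current = time.time() if now is None else now
    if current >= float(entry["expires_at"]):
        return False
    normalized = submitted.strip().upper()
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(str(entry["code"]).encode(), normalized.encode())
```

Using secrets rather than random keeps the codes unpredictable, which matters because a code is the only thing standing between an unknown user and an authorized DM session.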

Event Hook System:
- File-based hook discovery from ~/.hermes/hooks/
- HOOK.yaml + handler.py per hook, sync/async handler support
- Events: gateway:startup, session:start/reset, agent:start/step/end
- Wildcard matching (command:* catches all command events)
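
Wildcard matching and mixed sync/async handlers, as listed above, can be sketched with the standard library alone. HookRegistry and its methods are hypothetical names; the commit's actual loader reads HOOK.yaml and handler.py from disk, which this sketch leaves out in favour of registering callables directly.

```python
import asyncio
import fnmatch
import inspect
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[str, Dict[str, Any]], Any]


class HookRegistry:
    """Maps event patterns such as 'command:*' to handlers (hypothetical sketch)."""

    def __init__(self) -> None:
        self._handlers: List[Tuple[str, Handler]] = []

    def register(self, pattern: str, handler: Handler) -> None:
        self._handlers.append((pattern, handler))

    def matching(self, event: str) -> List[Handler]:
        # fnmatchcase gives case-sensitive glob matching, so 'command:*'
        # matches 'command:reset' but not 'session:reset'.
        return [h for p, h in self._handlers if fnmatch.fnmatchcase(event, p)]

    async def emit(self, event: str, context: Dict[str, Any]) -> int:
        """Run every matching handler in order; return how many ran."""
        count = 0
        for handler in self.matching(event):
            result = handler(event, context)
            if inspect.isawaitable(result):
                # Async handlers return a coroutine that must be awaited.
                await result
            count += 1
        return count
```

A usage example: registering a handler under "command:*" and calling `asyncio.run(registry.emit("command:reset", {}))` runs that handler once, while emitting "session:start" leaves it untouched.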

Cross-Channel Messaging:
- send_message agent tool for delivering to any connected platform
- Enables cron job delivery and cross-platform notifications
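
The send_message tool definition in this commit documents two target forms: 'platform' (home channel) and 'platform:chat_id' (specific chat). A minimal sketch of parsing that string follows. The parse_target function, the Target type, and the KNOWN_PLATFORMS set are assumptions for illustration, not code from the commit.

```python
from typing import NamedTuple, Optional

# Assumed set of platform names, based on the adapters named in this commit.
KNOWN_PLATFORMS = {"telegram", "discord", "slack", "whatsapp"}


class Target(NamedTuple):
    platform: str
    chat_id: Optional[str]  # None means "use the platform's home channel"


def parse_target(target: str) -> Target:
    """Split 'platform' or 'platform:chat_id' into its parts."""
    platform, sep, chat_id = target.strip().partition(":")
    platform = platform.lower()
    if platform not in KNOWN_PLATFORMS:
        raise ValueError(f"Unknown platform: {platform!r}")
    if sep and not chat_id:
        raise ValueError("Target has a ':' but no chat_id")
    return Target(platform, chat_id or None)
```

With this sketch, parse_target("telegram") yields a Target with chat_id None, and parse_target("slack:C01234ABCDE") keeps the channel ID with its case intact, since only the platform part is lowercased.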

Human-Like Response Pacing:
- Configurable delays between message chunks (off/natural/custom)
- HERMES_HUMAN_DELAY_MODE env var with min/max ms settings
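
The three pacing modes above could be resolved into a per-chunk delay along these lines. Only HERMES_HUMAN_DELAY_MODE is named in the commit message; the range used for "natural" and the "off" default are assumptions, and the min/max values are passed as arguments here because their env var names are not given.

```python
import os
import random

# Assumed range for "natural" mode, in milliseconds; the real values may differ.
NATURAL_RANGE_MS = (800, 2500)


def chunk_delay_seconds(mode: str, min_ms: int = 0, max_ms: int = 0, rng=random) -> float:
    """Return how long to wait before sending the next message chunk."""
    mode = mode.strip().lower()
    if mode == "off":
        return 0.0
    if mode == "natural":
        low, high = NATURAL_RANGE_MS
    elif mode == "custom":
        # Tolerate swapped min/max rather than failing on a config slip.
        low, high = sorted((min_ms, max_ms))
    else:
        raise ValueError(f"Unknown delay mode: {mode!r}")
    return rng.uniform(low, high) / 1000.0


# Defaulting to "off" when the variable is unset is an assumption of this sketch.
configured_mode = os.environ.get("HERMES_HUMAN_DELAY_MODE", "off")
```

A caller would sleep for chunk_delay_seconds(configured_mode, min_ms, max_ms) between chunks, using asyncio.sleep in an async adapter so other sessions are not blocked.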

Warm Injection Message Style:
- Retrofitted image vision messages with friendly kawaii-consistent tone
- All new injection messages (STT, stickers, errors) use warm style

Also: updated config migration to prompt for optional keys interactively,
bumped config version, updated README, AGENTS.md, .env.example,
cli-config.yaml.example, install scripts, pyproject.toml, and toolsets.
teknium1 2026-02-15 21:38:59 -08:00
parent 5404a8fcd8
commit 69aa35a51c
23 changed files with 2080 additions and 32 deletions


@ -903,6 +903,33 @@ def get_tts_tool_definitions() -> List[Dict[str, Any]]:
    ]


def get_send_message_tool_definitions():
    """Tool definitions for cross-channel messaging."""
    return [
        {
            "type": "function",
            "function": {
                "name": "send_message",
                "description": "Send a message to a user or channel on any connected messaging platform. Use this when the user asks you to send something to a different platform, or when delivering notifications/alerts to a specific destination.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "target": {
                            "type": "string",
                            "description": "Delivery target. Format: 'platform' (uses home channel) or 'platform:chat_id' (specific chat). Examples: 'telegram', 'discord:123456789', 'slack:C01234ABCDE'"
                        },
                        "message": {
                            "type": "string",
                            "description": "The message text to send"
                        }
                    },
                    "required": ["target", "message"]
                }
            }
        }
    ]


def get_all_tool_names() -> List[str]:
    """
    Get the names of all available tools across all toolsets.
@ -971,6 +998,9 @@ def get_all_tool_names() -> List[str]:
    if check_tts_requirements():
        tool_names.extend(["text_to_speech"])

    # Cross-channel messaging (always available on messaging platforms)
    tool_names.extend(["send_message"])

    return tool_names
@ -1019,6 +1049,8 @@ TOOL_TO_TOOLSET_MAP = {
    "write_file": "file_tools",
    "patch": "file_tools",
    "search": "file_tools",
    # Cross-channel messaging
    "send_message": "messaging_tools",
}
@ -1122,6 +1154,10 @@ def get_tool_definitions(
    for tool in get_tts_tool_definitions():
        all_available_tools_map[tool["function"]["name"]] = tool

    # Cross-channel messaging (always available on messaging platforms)
    for tool in get_send_message_tool_definitions():
        all_available_tools_map[tool["function"]["name"]] = tool

    # Determine which tools to include based on toolsets
    tools_to_include = set()
@ -1693,6 +1729,22 @@ def handle_tts_function_call(
    return json.dumps({"error": f"Unknown TTS function: {function_name}"}, ensure_ascii=False)


def handle_send_message_function_call(function_name, function_args):
    """Handle cross-channel send_message tool calls."""
    import json
    target = function_args.get("target", "")
    message = function_args.get("message", "")
    if not target or not message:
        return json.dumps({"error": "Both 'target' and 'message' are required"})

    # Store the pending message for the gateway to deliver
    # The gateway runner checks this after the agent loop completes
    import os
    os.environ["_HERMES_PENDING_SEND_TARGET"] = target
    os.environ["_HERMES_PENDING_SEND_MESSAGE"] = message
    return json.dumps({"success": True, "delivered_to": target, "note": "Message queued for delivery"})


def handle_function_call(
    function_name: str,
    function_args: Dict[str, Any],
@ -1774,6 +1826,10 @@ def handle_function_call(
        elif function_name in ["text_to_speech"]:
            return handle_tts_function_call(function_name, function_args)

        # Route cross-channel messaging
        elif function_name == "send_message":
            return handle_send_message_function_call(function_name, function_args)

        else:
            error_msg = f"Unknown function: {function_name}"
            print(f"{error_msg}")