The architecture has been updated

This commit is contained in:
Skyber_2 2026-03-31 23:31:36 +03:00
parent 805f7a017e
commit a01257ead9
1119 changed files with 226 additions and 352 deletions

View file

@ -0,0 +1,313 @@
# Adding a New Messaging Platform
Checklist for integrating a new messaging platform into the Hermes gateway.
Use this as a reference when building a new adapter — every item here is a
real integration point that exists in the codebase. Missing any of them will
cause broken functionality, missing features, or inconsistent behavior.
---
## 1. Core Adapter (`gateway/platforms/<platform>.py`)
The adapter is a subclass of `BasePlatformAdapter` from `gateway/platforms/base.py`.
### Required methods
| Method | Purpose |
|--------|---------|
| `__init__(self, config)` | Parse config, init state. Call `super().__init__(config, Platform.YOUR_PLATFORM)` |
| `connect() -> bool` | Connect to the platform, start listeners. Return True on success |
| `disconnect()` | Stop listeners, close connections, cancel tasks |
| `send(chat_id, text, ...) -> SendResult` | Send a text message |
| `send_typing(chat_id)` | Send typing indicator |
| `send_image(chat_id, image_url, caption) -> SendResult` | Send an image |
| `get_chat_info(chat_id) -> dict` | Return `{name, type, chat_id}` for a chat |
### Optional methods (have default stubs in base)
| Method | Purpose |
|--------|---------|
| `send_document(chat_id, path, caption)` | Send a file attachment |
| `send_voice(chat_id, path)` | Send a voice message |
| `send_video(chat_id, path, caption)` | Send a video |
| `send_animation(chat_id, path, caption)` | Send a GIF/animation |
| `send_image_file(chat_id, path, caption)` | Send image from local file |
### Required function
```python
def check_<platform>_requirements() -> bool:
"""Check if this platform's dependencies are available."""
```
### Key patterns to follow
- Use `self.build_source(...)` to construct `SessionSource` objects
- Call `self.handle_message(event)` to dispatch inbound messages to the gateway
- Use `MessageEvent`, `MessageType`, `SendResult` from base
- Use `cache_image_from_bytes`, `cache_audio_from_bytes`, `cache_document_from_bytes` for attachments
- Filter self-messages (prevent reply loops)
- Filter sync/echo messages if the platform has them
- Redact sensitive identifiers (phone numbers, tokens) in all log output
- Implement reconnection with exponential backoff + jitter for streaming connections
- Set `MAX_MESSAGE_LENGTH` if the platform has message size limits
---
## 2. Platform Enum (`gateway/config.py`)
Add the platform to the `Platform` enum:
```python
class Platform(Enum):
...
YOUR_PLATFORM = "your_platform"
```
Add env var loading in `_apply_env_overrides()`:
```python
# Your Platform
your_token = os.getenv("YOUR_PLATFORM_TOKEN")
if your_token:
if Platform.YOUR_PLATFORM not in config.platforms:
config.platforms[Platform.YOUR_PLATFORM] = PlatformConfig()
config.platforms[Platform.YOUR_PLATFORM].enabled = True
config.platforms[Platform.YOUR_PLATFORM].token = your_token
```
Update `get_connected_platforms()` if your platform doesn't use token/api_key
(e.g., WhatsApp uses `enabled` flag, Signal uses `extra` dict).
---
## 3. Adapter Factory (`gateway/run.py`)
Add to `_create_adapter()`:
```python
elif platform == Platform.YOUR_PLATFORM:
from gateway.platforms.your_platform import YourAdapter, check_your_requirements
if not check_your_requirements():
logger.warning("Your Platform: dependencies not met")
return None
return YourAdapter(config)
```
---
## 4. Authorization Maps (`gateway/run.py`)
Add to BOTH dicts in `_is_user_authorized()`:
```python
platform_env_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOWED_USERS",
}
platform_allow_all_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOW_ALL_USERS",
}
```
---
## 5. Session Source (`gateway/session.py`)
If your platform needs extra identity fields (e.g., Signal's UUID alongside
phone number), add them to the `SessionSource` dataclass with `Optional` defaults,
and update `to_dict()`, `from_dict()`, and `build_source()` in base.py.
---
## 6. System Prompt Hints (`agent/prompt_builder.py`)
Add a `PLATFORM_HINTS` entry so the agent knows what platform it's on:
```python
PLATFORM_HINTS = {
...
"your_platform": (
"You are on Your Platform. "
"Describe formatting capabilities, media support, etc."
),
}
```
Without this, the agent won't know it's on your platform and may use
inappropriate formatting (e.g., markdown on platforms that don't render it).
---
## 7. Toolset (`toolsets.py`)
Add a named toolset for your platform:
```python
"hermes-your-platform": {
"description": "Your Platform bot toolset",
"tools": _HERMES_CORE_TOOLS,
"includes": []
},
```
And add it to the `hermes-gateway` composite:
```python
"hermes-gateway": {
"includes": [..., "hermes-your-platform"]
}
```
---
## 8. Cron Delivery (`cron/scheduler.py`)
Add to `platform_map` in `_deliver_result()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Without this, `cronjob(action="create", deliver="your_platform", ...)` silently fails.
---
## 9. Send Message Tool (`tools/send_message_tool.py`)
Add to `platform_map` in `send_message_tool()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Add routing in `_send_to_platform()`:
```python
elif platform == Platform.YOUR_PLATFORM:
return await _send_your_platform(pconfig, chat_id, message)
```
Implement `_send_your_platform()` — a standalone async function that sends
a single message without requiring the full adapter (for use by cron jobs
and the send_message tool outside the gateway process).
Update the tool schema `target` description to include your platform example.
---
## 10. Cronjob Tool Schema (`tools/cronjob_tools.py`)
Update the `deliver` parameter description and docstring to mention your
platform as a delivery option.
---
## 11. Channel Directory (`gateway/channel_directory.py`)
If your platform can't enumerate chats (most can't), add it to the
session-based discovery list:
```python
for plat_name in ("telegram", "whatsapp", "signal", "your_platform"):
```
---
## 12. Status Display (`hermes_cli/status.py`)
Add to the `platforms` dict in the Messaging Platforms section:
```python
platforms = {
...
"Your Platform": ("YOUR_PLATFORM_TOKEN", "YOUR_PLATFORM_HOME_CHANNEL"),
}
```
---
## 13. Gateway Setup Wizard (`hermes_cli/gateway.py`)
Add to the `_PLATFORMS` list:
```python
{
"key": "your_platform",
"label": "Your Platform",
"emoji": "📱",
"token_var": "YOUR_PLATFORM_TOKEN",
"setup_instructions": [...],
"vars": [...],
}
```
If your platform needs custom setup logic (connectivity testing, QR codes,
policy choices), add a `_setup_your_platform()` function and route to it
in the platform selection switch.
Update `_platform_status()` if your platform's "configured" check differs
from the standard `bool(get_env_value(token_var))`.
---
## 14. Phone/ID Redaction (`agent/redact.py`)
If your platform uses sensitive identifiers (phone numbers, etc.), add a
regex pattern and redaction function to `agent/redact.py`. This ensures
identifiers are masked in ALL log output, not just your adapter's logs.
---
## 15. Documentation
| File | What to update |
|------|---------------|
| `README.md` | Platform list in feature table + documentation table |
| `AGENTS.md` | Gateway description + env var config section |
| `website/docs/user-guide/messaging/<platform>.md` | **NEW** — Full setup guide (see existing platform docs for template) |
| `website/docs/user-guide/messaging/index.md` | Architecture diagram, toolset table, security examples, Next Steps links |
| `website/docs/reference/environment-variables.md` | All env vars for the platform |
---
## 16. Tests (`tests/gateway/test_<platform>.py`)
Recommended test coverage:
- Platform enum exists with correct value
- Config loading from env vars via `_apply_env_overrides`
- Adapter init (config parsing, allowlist handling, default values)
- Helper functions (redaction, parsing, file type detection)
- Session source round-trip (to_dict → from_dict)
- Authorization integration (platform in allowlist maps)
- Send message tool routing (platform in platform_map)
Optional but valuable:
- Async tests for message handling flow (mock the platform API)
- SSE/WebSocket reconnection logic
- Attachment processing
- Group message filtering
---
## Quick Verification
After implementing everything, verify with:
```bash
# All tests pass
python -m pytest tests/ -q
# Grep for your platform name to find any missed integration points
grep -r "telegram\|discord\|whatsapp\|slack" gateway/ tools/ agent/ cron/ hermes_cli/ toolsets.py \
--include="*.py" -l | sort -u
# Check each file in the output — if it mentions other platforms but not yours, you missed it
```

View file

@ -0,0 +1,17 @@
"""
Platform adapters for messaging integrations.
Each adapter handles:
- Receiving messages from a platform
- Sending messages/responses back
- Platform-specific authentication
- Message formatting and media handling
"""
from .base import BasePlatformAdapter, MessageEvent, SendResult
__all__ = [
"BasePlatformAdapter",
"MessageEvent",
"SendResult",
]

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,340 @@
"""
DingTalk platform adapter using Stream Mode.
Uses dingtalk-stream SDK for real-time message reception without webhooks.
Responses are sent via DingTalk's session webhook (markdown format).
Requires:
pip install dingtalk-stream httpx
DINGTALK_CLIENT_ID and DINGTALK_CLIENT_SECRET env vars
Configuration in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: "your-app-key" # or DINGTALK_CLIENT_ID env var
client_secret: "your-secret" # or DINGTALK_CLIENT_SECRET env var
"""
import asyncio
import logging
import os
import time
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional
try:
import dingtalk_stream
from dingtalk_stream import ChatbotHandler, ChatbotMessage
DINGTALK_STREAM_AVAILABLE = True
except ImportError:
DINGTALK_STREAM_AVAILABLE = False
dingtalk_stream = None # type: ignore[assignment]
try:
import httpx
HTTPX_AVAILABLE = True
except ImportError:
HTTPX_AVAILABLE = False
httpx = None # type: ignore[assignment]
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
MAX_MESSAGE_LENGTH = 20000
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
def check_dingtalk_requirements() -> bool:
"""Check if DingTalk dependencies are available and configured."""
if not DINGTALK_STREAM_AVAILABLE or not HTTPX_AVAILABLE:
return False
if not os.getenv("DINGTALK_CLIENT_ID") or not os.getenv("DINGTALK_CLIENT_SECRET"):
return False
return True
class DingTalkAdapter(BasePlatformAdapter):
"""DingTalk chatbot adapter using Stream Mode.
The dingtalk-stream SDK maintains a long-lived WebSocket connection.
Incoming messages arrive via a ChatbotHandler callback. Replies are
sent via the incoming message's session_webhook URL using httpx.
"""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.DINGTALK)
extra = config.extra or {}
self._client_id: str = extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID", "")
self._client_secret: str = extra.get("client_secret") or os.getenv("DINGTALK_CLIENT_SECRET", "")
self._stream_client: Any = None
self._stream_task: Optional[asyncio.Task] = None
self._http_client: Optional["httpx.AsyncClient"] = None
# Message deduplication: msg_id -> timestamp
self._seen_messages: Dict[str, float] = {}
# Map chat_id -> session_webhook for reply routing
self._session_webhooks: Dict[str, str] = {}
# -- Connection lifecycle -----------------------------------------------
async def connect(self) -> bool:
"""Connect to DingTalk via Stream Mode."""
if not DINGTALK_STREAM_AVAILABLE:
logger.warning("[%s] dingtalk-stream not installed. Run: pip install dingtalk-stream", self.name)
return False
if not HTTPX_AVAILABLE:
logger.warning("[%s] httpx not installed. Run: pip install httpx", self.name)
return False
if not self._client_id or not self._client_secret:
logger.warning("[%s] DINGTALK_CLIENT_ID and DINGTALK_CLIENT_SECRET required", self.name)
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0)
credential = dingtalk_stream.Credential(self._client_id, self._client_secret)
self._stream_client = dingtalk_stream.DingTalkStreamClient(credential)
# Capture the current event loop for cross-thread dispatch
loop = asyncio.get_running_loop()
handler = _IncomingHandler(self, loop)
self._stream_client.register_callback_handler(
dingtalk_stream.ChatbotMessage.TOPIC, handler
)
self._stream_task = asyncio.create_task(self._run_stream())
self._mark_connected()
logger.info("[%s] Connected via Stream Mode", self.name)
return True
except Exception as e:
logger.error("[%s] Failed to connect: %s", self.name, e)
return False
async def _run_stream(self) -> None:
"""Run the blocking stream client with auto-reconnection."""
backoff_idx = 0
while self._running:
try:
logger.debug("[%s] Starting stream client...", self.name)
await asyncio.to_thread(self._stream_client.start)
except asyncio.CancelledError:
return
except Exception as e:
if not self._running:
return
logger.warning("[%s] Stream client error: %s", self.name, e)
if not self._running:
return
delay = RECONNECT_BACKOFF[min(backoff_idx, len(RECONNECT_BACKOFF) - 1)]
logger.info("[%s] Reconnecting in %ds...", self.name, delay)
await asyncio.sleep(delay)
backoff_idx += 1
async def disconnect(self) -> None:
"""Disconnect from DingTalk."""
self._running = False
self._mark_disconnected()
if self._stream_task:
self._stream_task.cancel()
try:
await self._stream_task
except asyncio.CancelledError:
pass
self._stream_task = None
if self._http_client:
await self._http_client.aclose()
self._http_client = None
self._stream_client = None
self._session_webhooks.clear()
self._seen_messages.clear()
logger.info("[%s] Disconnected", self.name)
# -- Inbound message processing -----------------------------------------
async def _on_message(self, message: "ChatbotMessage") -> None:
"""Process an incoming DingTalk chatbot message."""
msg_id = getattr(message, "message_id", None) or uuid.uuid4().hex
if self._is_duplicate(msg_id):
logger.debug("[%s] Duplicate message %s, skipping", self.name, msg_id)
return
text = self._extract_text(message)
if not text:
logger.debug("[%s] Empty message, skipping", self.name)
return
# Chat context
conversation_id = getattr(message, "conversation_id", "") or ""
conversation_type = getattr(message, "conversation_type", "1")
is_group = str(conversation_type) == "2"
sender_id = getattr(message, "sender_id", "") or ""
sender_nick = getattr(message, "sender_nick", "") or sender_id
sender_staff_id = getattr(message, "sender_staff_id", "") or ""
chat_id = conversation_id or sender_id
chat_type = "group" if is_group else "dm"
# Store session webhook for reply routing
session_webhook = getattr(message, "session_webhook", None) or ""
if session_webhook and chat_id:
self._session_webhooks[chat_id] = session_webhook
source = self.build_source(
chat_id=chat_id,
chat_name=getattr(message, "conversation_title", None),
chat_type=chat_type,
user_id=sender_id,
user_name=sender_nick,
user_id_alt=sender_staff_id if sender_staff_id else None,
)
# Parse timestamp
create_at = getattr(message, "create_at", None)
try:
timestamp = datetime.fromtimestamp(int(create_at) / 1000, tz=timezone.utc) if create_at else datetime.now(tz=timezone.utc)
except (ValueError, OSError, TypeError):
timestamp = datetime.now(tz=timezone.utc)
event = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
message_id=msg_id,
raw_message=message,
timestamp=timestamp,
)
logger.debug("[%s] Message from %s in %s: %s",
self.name, sender_nick, chat_id[:20] if chat_id else "?", text[:50])
await self.handle_message(event)
@staticmethod
def _extract_text(message: "ChatbotMessage") -> str:
"""Extract plain text from a DingTalk chatbot message."""
text = getattr(message, "text", None) or ""
if isinstance(text, dict):
content = text.get("content", "").strip()
else:
content = str(text).strip()
# Fall back to rich text if present
if not content:
rich_text = getattr(message, "rich_text", None)
if rich_text and isinstance(rich_text, list):
parts = [item["text"] for item in rich_text
if isinstance(item, dict) and item.get("text")]
content = " ".join(parts).strip()
return content
# -- Deduplication ------------------------------------------------------
def _is_duplicate(self, msg_id: str) -> bool:
"""Check and record a message ID. Returns True if already seen."""
now = time.time()
if len(self._seen_messages) > DEDUP_MAX_SIZE:
cutoff = now - DEDUP_WINDOW_SECONDS
self._seen_messages = {k: v for k, v in self._seen_messages.items() if v > cutoff}
if msg_id in self._seen_messages:
return True
self._seen_messages[msg_id] = now
return False
# -- Outbound messaging -------------------------------------------------
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a markdown reply via DingTalk session webhook."""
metadata = metadata or {}
session_webhook = metadata.get("session_webhook") or self._session_webhooks.get(chat_id)
if not session_webhook:
return SendResult(success=False,
error="No session_webhook available. Reply must follow an incoming message.")
if not self._http_client:
return SendResult(success=False, error="HTTP client not initialized")
payload = {
"msgtype": "markdown",
"markdown": {"title": "Hermes", "text": content[:self.MAX_MESSAGE_LENGTH]},
}
try:
resp = await self._http_client.post(session_webhook, json=payload, timeout=15.0)
if resp.status_code < 300:
return SendResult(success=True, message_id=uuid.uuid4().hex[:12])
body = resp.text
logger.warning("[%s] Send failed HTTP %d: %s", self.name, resp.status_code, body[:200])
return SendResult(success=False, error=f"HTTP {resp.status_code}: {body[:200]}")
except httpx.TimeoutException:
return SendResult(success=False, error="Timeout sending message to DingTalk")
except Exception as e:
logger.error("[%s] Send error: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""DingTalk does not support typing indicators."""
pass
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return basic info about a DingTalk conversation."""
return {"name": chat_id, "type": "group" if "group" in chat_id.lower() else "dm"}
# ---------------------------------------------------------------------------
# Internal stream handler
# ---------------------------------------------------------------------------
class _IncomingHandler(ChatbotHandler if DINGTALK_STREAM_AVAILABLE else object):
"""dingtalk-stream ChatbotHandler that forwards messages to the adapter."""
def __init__(self, adapter: DingTalkAdapter, loop: asyncio.AbstractEventLoop):
if DINGTALK_STREAM_AVAILABLE:
super().__init__()
self._adapter = adapter
self._loop = loop
def process(self, message: "ChatbotMessage"):
"""Called by dingtalk-stream in its thread when a message arrives.
Schedules the async handler on the main event loop.
"""
loop = self._loop
if loop is None or loop.is_closed():
logger.error("[DingTalk] Event loop unavailable, cannot dispatch message")
return dingtalk_stream.AckMessage.STATUS_OK, "OK"
future = asyncio.run_coroutine_threadsafe(self._adapter._on_message(message), loop)
try:
future.result(timeout=60)
except Exception:
logger.exception("[DingTalk] Error processing incoming message")
return dingtalk_stream.AckMessage.STATUS_OK, "OK"

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,550 @@
"""
Email platform adapter for the Hermes gateway.
Allows users to interact with Hermes by sending emails.
Uses IMAP to receive and SMTP to send messages.
Environment variables:
EMAIL_IMAP_HOST IMAP server host (e.g., imap.gmail.com)
EMAIL_IMAP_PORT IMAP server port (default: 993)
EMAIL_SMTP_HOST SMTP server host (e.g., smtp.gmail.com)
EMAIL_SMTP_PORT SMTP server port (default: 587)
EMAIL_ADDRESS Email address for the agent
EMAIL_PASSWORD Email password or app-specific password
EMAIL_POLL_INTERVAL Seconds between mailbox checks (default: 15)
EMAIL_ALLOWED_USERS Comma-separated list of allowed sender addresses
"""
import asyncio
import email as email_lib
import imaplib
import logging
import os
import re
import smtplib
import ssl
import uuid
from datetime import datetime
from email.header import decode_header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
from pathlib import Path
from typing import Any, Dict, List, Optional
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_document_from_bytes,
cache_image_from_bytes,
)
from gateway.config import Platform, PlatformConfig
logger = logging.getLogger(__name__)
# Gmail-safe max length per email body
MAX_MESSAGE_LENGTH = 50_000
# Supported image extensions for inline detection
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def check_email_requirements() -> bool:
"""Check if email platform dependencies are available."""
addr = os.getenv("EMAIL_ADDRESS")
pwd = os.getenv("EMAIL_PASSWORD")
imap = os.getenv("EMAIL_IMAP_HOST")
smtp = os.getenv("EMAIL_SMTP_HOST")
if not all([addr, pwd, imap, smtp]):
return False
return True
def _decode_header_value(raw: str) -> str:
"""Decode an RFC 2047 encoded email header into a plain string."""
parts = decode_header(raw)
decoded = []
for part, charset in parts:
if isinstance(part, bytes):
decoded.append(part.decode(charset or "utf-8", errors="replace"))
else:
decoded.append(part)
return " ".join(decoded)
def _extract_text_body(msg: email_lib.message.Message) -> str:
"""Extract the plain-text body from a potentially multipart email."""
if msg.is_multipart():
for part in msg.walk():
content_type = part.get_content_type()
disposition = str(part.get("Content-Disposition", ""))
# Skip attachments
if "attachment" in disposition:
continue
if content_type == "text/plain":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
return payload.decode(charset, errors="replace")
# Fallback: try text/html and strip tags
for part in msg.walk():
content_type = part.get_content_type()
disposition = str(part.get("Content-Disposition", ""))
if "attachment" in disposition:
continue
if content_type == "text/html":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
html = payload.decode(charset, errors="replace")
return _strip_html(html)
return ""
else:
payload = msg.get_payload(decode=True)
if payload:
charset = msg.get_content_charset() or "utf-8"
text = payload.decode(charset, errors="replace")
if msg.get_content_type() == "text/html":
return _strip_html(text)
return text
return ""
def _strip_html(html: str) -> str:
"""Naive HTML tag stripper for fallback text extraction."""
text = re.sub(r"<br\s*/?>", "\n", html, flags=re.IGNORECASE)
text = re.sub(r"<p[^>]*>", "\n", text, flags=re.IGNORECASE)
text = re.sub(r"</p>", "\n", text, flags=re.IGNORECASE)
text = re.sub(r"<[^>]+>", "", text)
text = re.sub(r"&nbsp;", " ", text)
text = re.sub(r"&amp;", "&", text)
text = re.sub(r"&lt;", "<", text)
text = re.sub(r"&gt;", ">", text)
text = re.sub(r"\n{3,}", "\n\n", text)
return text.strip()
def _extract_email_address(raw: str) -> str:
"""Extract bare email address from 'Name <addr>' format."""
match = re.search(r"<([^>]+)>", raw)
if match:
return match.group(1).strip().lower()
return raw.strip().lower()
def _extract_attachments(
msg: email_lib.message.Message,
skip_attachments: bool = False,
) -> List[Dict[str, Any]]:
"""Extract attachment metadata and cache files locally.
When *skip_attachments* is True, all attachment/inline parts are ignored
(useful for malware protection or bandwidth savings).
"""
attachments = []
if not msg.is_multipart():
return attachments
for part in msg.walk():
disposition = str(part.get("Content-Disposition", ""))
if skip_attachments and ("attachment" in disposition or "inline" in disposition):
continue
if "attachment" not in disposition and "inline" not in disposition:
continue
# Skip text/plain and text/html body parts
content_type = part.get_content_type()
if content_type in ("text/plain", "text/html") and "attachment" not in disposition:
continue
filename = part.get_filename()
if filename:
filename = _decode_header_value(filename)
else:
ext = part.get_content_subtype() or "bin"
filename = f"attachment.{ext}"
payload = part.get_payload(decode=True)
if not payload:
continue
ext = Path(filename).suffix.lower()
if ext in _IMAGE_EXTS:
cached_path = cache_image_from_bytes(payload, ext)
attachments.append({
"path": cached_path,
"filename": filename,
"type": "image",
"media_type": content_type,
})
else:
cached_path = cache_document_from_bytes(payload, filename)
attachments.append({
"path": cached_path,
"filename": filename,
"type": "document",
"media_type": content_type,
})
return attachments
class EmailAdapter(BasePlatformAdapter):
"""Email gateway adapter using IMAP (receive) and SMTP (send)."""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.EMAIL)
self._address = os.getenv("EMAIL_ADDRESS", "")
self._password = os.getenv("EMAIL_PASSWORD", "")
self._imap_host = os.getenv("EMAIL_IMAP_HOST", "")
self._imap_port = int(os.getenv("EMAIL_IMAP_PORT", "993"))
self._smtp_host = os.getenv("EMAIL_SMTP_HOST", "")
self._smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
self._poll_interval = int(os.getenv("EMAIL_POLL_INTERVAL", "15"))
# Skip attachments — configured via config.yaml:
# platforms:
# email:
# skip_attachments: true
extra = config.extra or {}
self._skip_attachments = extra.get("skip_attachments", False)
# Track message IDs we've already processed to avoid duplicates
self._seen_uids: set = set()
self._poll_task: Optional[asyncio.Task] = None
# Map chat_id (sender email) -> last subject + message-id for threading
self._thread_context: Dict[str, Dict[str, str]] = {}
logger.info("[Email] Adapter initialized for %s", self._address)
async def connect(self) -> bool:
"""Connect to the IMAP server and start polling for new messages."""
try:
# Test IMAP connection
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port)
imap.login(self._address, self._password)
# Mark all existing messages as seen so we only process new ones
imap.select("INBOX")
status, data = imap.uid("search", None, "ALL")
if status == "OK" and data and data[0]:
for uid in data[0].split():
self._seen_uids.add(uid)
imap.logout()
logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
except Exception as e:
logger.error("[Email] IMAP connection failed: %s", e)
return False
try:
# Test SMTP connection
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.quit()
logger.info("[Email] SMTP connection test passed.")
except Exception as e:
logger.error("[Email] SMTP connection failed: %s", e)
return False
self._running = True
self._poll_task = asyncio.create_task(self._poll_loop())
print(f"[Email] Connected as {self._address}")
return True
async def disconnect(self) -> None:
"""Stop polling and disconnect."""
self._running = False
if self._poll_task:
self._poll_task.cancel()
try:
await self._poll_task
except asyncio.CancelledError:
pass
self._poll_task = None
logger.info("[Email] Disconnected.")
async def _poll_loop(self) -> None:
"""Poll IMAP for new messages at regular intervals."""
while self._running:
try:
await self._check_inbox()
except asyncio.CancelledError:
break
except Exception as e:
logger.error("[Email] Poll error: %s", e)
await asyncio.sleep(self._poll_interval)
async def _check_inbox(self) -> None:
"""Check INBOX for unseen messages and dispatch them."""
# Run IMAP operations in a thread to avoid blocking the event loop
loop = asyncio.get_running_loop()
messages = await loop.run_in_executor(None, self._fetch_new_messages)
for msg_data in messages:
await self._dispatch_message(msg_data)
def _fetch_new_messages(self) -> List[Dict[str, Any]]:
"""Fetch new (unseen) messages from IMAP. Runs in executor thread."""
results = []
try:
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port)
imap.login(self._address, self._password)
imap.select("INBOX")
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
imap.logout()
return results
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
imap.logout()
except Exception as e:
logger.error("[Email] IMAP fetch error: %s", e)
return results
async def _dispatch_message(self, msg_data: Dict[str, Any]) -> None:
"""Convert a fetched email into a MessageEvent and dispatch it."""
sender_addr = msg_data["sender_addr"]
# Skip self-messages
if sender_addr == self._address.lower():
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
# Build message text: include subject as context
text = body
if subject and not subject.startswith("Re:"):
text = f"[Subject: {subject}]\n\n{body}"
# Determine message type and media
media_urls = []
media_types = []
msg_type = MessageType.TEXT
for att in attachments:
media_urls.append(att["path"])
media_types.append(att["media_type"])
if att["type"] == "image":
msg_type = MessageType.PHOTO
# Store thread context for reply threading
self._thread_context[sender_addr] = {
"subject": subject,
"message_id": msg_data["message_id"],
}
source = self.build_source(
chat_id=sender_addr,
chat_name=msg_data["sender_name"] or sender_addr,
chat_type="dm",
user_id=sender_addr,
user_name=msg_data["sender_name"] or sender_addr,
)
event = MessageEvent(
text=text or "(empty email)",
message_type=msg_type,
source=source,
message_id=msg_data["message_id"],
media_urls=media_urls,
media_types=media_types,
reply_to_message_id=msg_data["in_reply_to"] or None,
)
logger.info("[Email] New message from %s: %s", sender_addr, subject)
await self.handle_message(event)
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an email reply to the given address."""
try:
loop = asyncio.get_running_loop()
message_id = await loop.run_in_executor(
None, self._send_email, chat_id, content, reply_to
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error("[Email] Send failed to %s: %s", chat_id, e)
return SendResult(success=False, error=str(e))
def _send_email(
self,
to_addr: str,
body: str,
reply_to_msg_id: Optional[str] = None,
) -> str:
"""Send an email via SMTP. Runs in executor thread."""
msg = MIMEMultipart()
msg["From"] = self._address
msg["To"] = to_addr
# Thread context for reply
ctx = self._thread_context.get(to_addr, {})
subject = ctx.get("subject", "Hermes Agent")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
msg["Subject"] = subject
# Threading headers
original_msg_id = reply_to_msg_id or ctx.get("message_id")
if original_msg_id:
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
msg.attach(MIMEText(body, "plain", "utf-8"))
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
logger.info("[Email] Sent reply to %s (subject: %s)", to_addr, subject)
return msg_id
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
"""Email has no typing indicator — no-op."""
pass
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an image URL as part of an email body."""
text = caption or ""
text += f"\n\nImage: {image_url}"
return await self.send(chat_id, text.strip(), reply_to)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a file as an email attachment."""
try:
loop = asyncio.get_running_loop()
message_id = await loop.run_in_executor(
None,
self._send_email_with_attachment,
chat_id,
caption or "",
file_path,
file_name,
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error("[Email] Send document failed: %s", e)
return SendResult(success=False, error=str(e))
def _send_email_with_attachment(
self,
to_addr: str,
body: str,
file_path: str,
file_name: Optional[str] = None,
) -> str:
"""Send an email with a file attachment via SMTP."""
msg = MIMEMultipart()
msg["From"] = self._address
msg["To"] = to_addr
ctx = self._thread_context.get(to_addr, {})
subject = ctx.get("subject", "Hermes Agent")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
msg["Subject"] = subject
original_msg_id = ctx.get("message_id")
if original_msg_id:
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
if body:
msg.attach(MIMEText(body, "plain", "utf-8"))
# Attach file
p = Path(file_path)
fname = file_name or p.name
with open(p, "rb") as f:
part = MIMEBase("application", "octet-stream")
part.set_payload(f.read())
encoders.encode_base64(part)
part.add_header("Content-Disposition", f"attachment; filename={fname}")
msg.attach(part)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
return msg_id
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return basic info about the email chat."""
ctx = self._thread_context.get(chat_id, {})
return {
"name": chat_id,
"type": "dm",
"chat_id": chat_id,
"subject": ctx.get("subject", ""),
}

View file

@ -0,0 +1,446 @@
"""
Home Assistant platform adapter.
Connects to the HA WebSocket API for real-time event monitoring.
State-change events are converted to MessageEvent objects and forwarded
to the agent for processing. Outbound messages are delivered as HA
persistent notifications.
Requires:
- aiohttp (already in messaging extras)
- HASS_TOKEN env var (Long-Lived Access Token)
- HASS_URL env var (default: http://homeassistant.local:8123)
"""
import asyncio
import json
import logging
import os
import time
import uuid
from datetime import datetime
from typing import Any, Dict, List, Optional, Set
try:
import aiohttp
AIOHTTP_AVAILABLE = True
except ImportError:
AIOHTTP_AVAILABLE = False
aiohttp = None # type: ignore[assignment]
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
def check_ha_requirements() -> bool:
"""Check if Home Assistant dependencies are available and configured."""
if not AIOHTTP_AVAILABLE:
return False
if not os.getenv("HASS_TOKEN"):
return False
return True
class HomeAssistantAdapter(BasePlatformAdapter):
"""
Home Assistant WebSocket adapter.
Subscribes to ``state_changed`` events and forwards them as
MessageEvent objects. Supports domain/entity filtering and
per-entity cooldowns to avoid event floods.
"""
MAX_MESSAGE_LENGTH = 4096
# Reconnection backoff schedule (seconds)
_BACKOFF_STEPS = [5, 10, 30, 60]
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.HOMEASSISTANT)
# Connection state
self._session: Optional["aiohttp.ClientSession"] = None
self._ws: Optional["aiohttp.ClientWebSocketResponse"] = None
self._rest_session: Optional["aiohttp.ClientSession"] = None
self._listen_task: Optional[asyncio.Task] = None
self._msg_id: int = 0
# Configuration from extra
extra = config.extra or {}
token = config.token or os.getenv("HASS_TOKEN", "")
url = extra.get("url") or os.getenv("HASS_URL", "http://homeassistant.local:8123")
self._hass_url: str = url.rstrip("/")
self._hass_token: str = token
# Event filtering
self._watch_domains: Set[str] = set(extra.get("watch_domains", []))
self._watch_entities: Set[str] = set(extra.get("watch_entities", []))
self._ignore_entities: Set[str] = set(extra.get("ignore_entities", []))
self._watch_all: bool = bool(extra.get("watch_all", False))
self._cooldown_seconds: int = int(extra.get("cooldown_seconds", 30))
# Cooldown tracking: entity_id -> last_event_timestamp
self._last_event_time: Dict[str, float] = {}
def _next_id(self) -> int:
"""Return the next WebSocket message ID."""
self._msg_id += 1
return self._msg_id
# ------------------------------------------------------------------
# Connection lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to HA WebSocket API and subscribe to events."""
if not AIOHTTP_AVAILABLE:
logger.warning("[%s] aiohttp not installed. Run: pip install aiohttp", self.name)
return False
if not self._hass_token:
logger.warning("[%s] No HASS_TOKEN configured", self.name)
return False
try:
success = await self._ws_connect()
if not success:
return False
# Dedicated REST session for send() calls
self._rest_session = aiohttp.ClientSession()
# Warn if no event filters are configured
if not self._watch_domains and not self._watch_entities and not self._watch_all:
logger.warning(
"[%s] No watch_domains, watch_entities, or watch_all configured. "
"All state_changed events will be dropped. Configure filters in "
"your HA platform config to receive events.",
self.name,
)
# Start background listener
self._listen_task = asyncio.create_task(self._listen_loop())
self._running = True
logger.info("[%s] Connected to %s", self.name, self._hass_url)
return True
except Exception as e:
logger.error("[%s] Failed to connect: %s", self.name, e)
return False
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession()
self._ws = await self._session.ws_connect(ws_url, heartbeat=30)
# Step 1: Receive auth_required
msg = await self._ws.receive_json()
if msg.get("type") != "auth_required":
logger.error("Expected auth_required, got: %s", msg.get("type"))
await self._cleanup_ws()
return False
# Step 2: Send auth
await self._ws.send_json({
"type": "auth",
"access_token": self._hass_token,
})
# Step 3: Wait for auth_ok
msg = await self._ws.receive_json()
if msg.get("type") != "auth_ok":
logger.error("Auth failed: %s", msg)
await self._cleanup_ws()
return False
# Step 4: Subscribe to state_changed events
sub_id = self._next_id()
await self._ws.send_json({
"id": sub_id,
"type": "subscribe_events",
"event_type": "state_changed",
})
# Verify subscription acknowledgement
msg = await self._ws.receive_json()
if not msg.get("success"):
logger.error("Failed to subscribe to events: %s", msg)
await self._cleanup_ws()
return False
return True
async def _cleanup_ws(self) -> None:
"""Close WebSocket and session."""
if self._ws and not self._ws.closed:
await self._ws.close()
self._ws = None
if self._session and not self._session.closed:
await self._session.close()
self._session = None
async def disconnect(self) -> None:
"""Disconnect from Home Assistant."""
self._running = False
if self._listen_task:
self._listen_task.cancel()
try:
await self._listen_task
except asyncio.CancelledError:
pass
self._listen_task = None
await self._cleanup_ws()
if self._rest_session and not self._rest_session.closed:
await self._rest_session.close()
self._rest_session = None
logger.info("[%s] Disconnected", self.name)
# ------------------------------------------------------------------
# Event listener
# ------------------------------------------------------------------
async def _listen_loop(self) -> None:
"""Main event loop with automatic reconnection."""
backoff_idx = 0
while self._running:
try:
await self._read_events()
except asyncio.CancelledError:
return
except Exception as e:
logger.warning("[%s] WebSocket error: %s", self.name, e)
if not self._running:
return
# Reconnect with backoff
delay = self._BACKOFF_STEPS[min(backoff_idx, len(self._BACKOFF_STEPS) - 1)]
logger.info("[%s] Reconnecting in %ds...", self.name, delay)
await asyncio.sleep(delay)
backoff_idx += 1
try:
await self._cleanup_ws()
success = await self._ws_connect()
if success:
backoff_idx = 0 # Reset on successful reconnect
logger.info("[%s] Reconnected", self.name)
except Exception as e:
logger.warning("[%s] Reconnection failed: %s", self.name, e)
async def _read_events(self) -> None:
"""Read events from WebSocket until disconnected."""
if self._ws is None or self._ws.closed:
return
async for ws_msg in self._ws:
if ws_msg.type == aiohttp.WSMsgType.TEXT:
try:
data = json.loads(ws_msg.data)
if data.get("type") == "event":
await self._handle_ha_event(data.get("event", {}))
except json.JSONDecodeError:
logger.debug("Invalid JSON from HA WS: %s", ws_msg.data[:200])
elif ws_msg.type in (aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR):
break
async def _handle_ha_event(self, event: Dict[str, Any]) -> None:
"""Process a state_changed event from Home Assistant."""
event_data = event.get("data", {})
entity_id: str = event_data.get("entity_id", "")
if not entity_id:
return
# Apply ignore filter
if entity_id in self._ignore_entities:
return
# Apply domain/entity watch filters (closed by default — require
# explicit watch_domains, watch_entities, or watch_all to forward)
domain = entity_id.split(".")[0] if "." in entity_id else ""
if self._watch_domains or self._watch_entities:
domain_match = domain in self._watch_domains if self._watch_domains else False
entity_match = entity_id in self._watch_entities if self._watch_entities else False
if not domain_match and not entity_match:
return
elif not self._watch_all:
# No filters configured and watch_all is off — drop the event
return
# Apply cooldown
now = time.time()
last = self._last_event_time.get(entity_id, 0)
if (now - last) < self._cooldown_seconds:
return
self._last_event_time[entity_id] = now
# Build human-readable message
old_state = event_data.get("old_state", {})
new_state = event_data.get("new_state", {})
message = self._format_state_change(entity_id, old_state, new_state)
if not message:
return
# Build MessageEvent and forward to handler
source = self.build_source(
chat_id="ha_events",
chat_name="Home Assistant Events",
chat_type="channel",
user_id="homeassistant",
user_name="Home Assistant",
)
msg_event = MessageEvent(
text=message,
message_type=MessageType.TEXT,
source=source,
message_id=f"ha_{entity_id}_{int(now)}",
timestamp=datetime.now(),
)
await self.handle_message(msg_event)
@staticmethod
def _format_state_change(
entity_id: str,
old_state: Dict[str, Any],
new_state: Dict[str, Any],
) -> Optional[str]:
"""Convert a state_changed event into a human-readable description."""
if not new_state:
return None
old_val = old_state.get("state", "unknown") if old_state else "unknown"
new_val = new_state.get("state", "unknown")
# Skip if state didn't actually change
if old_val == new_val:
return None
friendly_name = new_state.get("attributes", {}).get("friendly_name", entity_id)
domain = entity_id.split(".")[0] if "." in entity_id else ""
# Domain-specific formatting
if domain == "climate":
attrs = new_state.get("attributes", {})
temp = attrs.get("current_temperature", "?")
target = attrs.get("temperature", "?")
return (
f"[Home Assistant] {friendly_name}: HVAC mode changed from "
f"'{old_val}' to '{new_val}' (current: {temp}, target: {target})"
)
if domain == "sensor":
unit = new_state.get("attributes", {}).get("unit_of_measurement", "")
return (
f"[Home Assistant] {friendly_name}: changed from "
f"{old_val}{unit} to {new_val}{unit}"
)
if domain == "binary_sensor":
return (
f"[Home Assistant] {friendly_name}: "
f"{'triggered' if new_val == 'on' else 'cleared'} "
f"(was {'triggered' if old_val == 'on' else 'cleared'})"
)
if domain in ("light", "switch", "fan"):
return (
f"[Home Assistant] {friendly_name}: turned "
f"{'on' if new_val == 'on' else 'off'}"
)
if domain == "alarm_control_panel":
return (
f"[Home Assistant] {friendly_name}: alarm state changed from "
f"'{old_val}' to '{new_val}'"
)
# Generic fallback
return (
f"[Home Assistant] {friendly_name} ({entity_id}): "
f"changed from '{old_val}' to '{new_val}'"
)
# ------------------------------------------------------------------
# Outbound messaging
# ------------------------------------------------------------------
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notification via HA REST API (persistent_notification.create).
Uses the REST API instead of WebSocket to avoid a race condition
with the event listener loop that reads from the same WS connection.
"""
url = f"{self._hass_url}/api/services/persistent_notification/create"
headers = {
"Authorization": f"Bearer {self._hass_token}",
"Content-Type": "application/json",
}
payload = {
"title": "Hermes Agent",
"message": content[:self.MAX_MESSAGE_LENGTH],
}
try:
if self._rest_session:
async with self._rest_session.post(
url,
headers=headers,
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status < 300:
return SendResult(success=True, message_id=uuid.uuid4().hex[:12])
else:
body = await resp.text()
return SendResult(success=False, error=f"HTTP {resp.status}: {body}")
else:
async with aiohttp.ClientSession() as session:
async with session.post(
url,
headers=headers,
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status < 300:
return SendResult(success=True, message_id=uuid.uuid4().hex[:12])
else:
body = await resp.text()
return SendResult(success=False, error=f"HTTP {resp.status}: {body}")
except asyncio.TimeoutError:
return SendResult(success=False, error="Timeout sending notification to HA")
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""No typing indicator for Home Assistant."""
pass
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return basic info about the HA event channel."""
return {
"name": "Home Assistant Events",
"type": "channel",
"url": self._hass_url,
}

View file

@ -0,0 +1,895 @@
"""Matrix gateway adapter.
Connects to any Matrix homeserver (self-hosted or matrix.org) via the
matrix-nio Python SDK. Supports optional end-to-end encryption (E2EE)
when installed with ``pip install "matrix-nio[e2e]"``.
Environment variables:
MATRIX_HOMESERVER Homeserver URL (e.g. https://matrix.example.org)
MATRIX_ACCESS_TOKEN Access token (preferred auth method)
MATRIX_USER_ID Full user ID (@bot:server) required for password login
MATRIX_PASSWORD Password (alternative to access token)
MATRIX_ENCRYPTION Set "true" to enable E2EE
MATRIX_ALLOWED_USERS Comma-separated Matrix user IDs (@user:server)
MATRIX_HOME_ROOM Room ID for cron/notification delivery
"""
from __future__ import annotations
import asyncio
import json
import logging
import mimetypes
import os
import re
import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
# Matrix message size limit (4000 chars practical, spec has no hard limit
# but clients render poorly above this).
MAX_MESSAGE_LENGTH = 4000
# Store directory for E2EE keys and sync state.
_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
def check_matrix_requirements() -> bool:
"""Return True if the Matrix adapter can be used."""
token = os.getenv("MATRIX_ACCESS_TOKEN", "")
password = os.getenv("MATRIX_PASSWORD", "")
homeserver = os.getenv("MATRIX_HOMESERVER", "")
if not token and not password:
logger.debug("Matrix: neither MATRIX_ACCESS_TOKEN nor MATRIX_PASSWORD set")
return False
if not homeserver:
logger.warning("Matrix: MATRIX_HOMESERVER not set")
return False
try:
import nio # noqa: F401
return True
except ImportError:
logger.warning(
"Matrix: matrix-nio not installed. "
"Run: pip install 'matrix-nio[e2e]'"
)
return False
class MatrixAdapter(BasePlatformAdapter):
"""Gateway adapter for Matrix (any homeserver)."""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.MATRIX)
self._homeserver: str = (
config.extra.get("homeserver", "")
or os.getenv("MATRIX_HOMESERVER", "")
).rstrip("/")
self._access_token: str = config.token or os.getenv("MATRIX_ACCESS_TOKEN", "")
self._user_id: str = (
config.extra.get("user_id", "")
or os.getenv("MATRIX_USER_ID", "")
)
self._password: str = (
config.extra.get("password", "")
or os.getenv("MATRIX_PASSWORD", "")
)
self._encryption: bool = config.extra.get(
"encryption",
os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes"),
)
self._client: Any = None # nio.AsyncClient
self._sync_task: Optional[asyncio.Task] = None
self._closing = False
self._startup_ts: float = 0.0
# Cache: room_id → bool (is DM)
self._dm_rooms: Dict[str, bool] = {}
# Set of room IDs we've joined
self._joined_rooms: Set[str] = set()
# Event deduplication (bounded deque keeps newest entries)
from collections import deque
self._processed_events: deque = deque(maxlen=1000)
self._processed_events_set: set = set()
def _is_duplicate_event(self, event_id) -> bool:
"""Return True if this event was already processed. Tracks the ID otherwise."""
if not event_id:
return False
if event_id in self._processed_events_set:
return True
if len(self._processed_events) == self._processed_events.maxlen:
evicted = self._processed_events[0]
self._processed_events_set.discard(evicted)
self._processed_events.append(event_id)
self._processed_events_set.add(event_id)
return False
# ------------------------------------------------------------------
# Required overrides
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to the Matrix homeserver and start syncing."""
import nio
if not self._homeserver:
logger.error("Matrix: homeserver URL not configured")
return False
# Determine store path and ensure it exists.
store_path = str(_STORE_DIR)
_STORE_DIR.mkdir(parents=True, exist_ok=True)
# Create the client.
if self._encryption:
try:
client = nio.AsyncClient(
self._homeserver,
self._user_id or "",
store_path=store_path,
)
logger.info("Matrix: E2EE enabled (store: %s)", store_path)
except Exception as exc:
logger.warning(
"Matrix: failed to create E2EE client (%s), "
"falling back to plain client. Install: "
"pip install 'matrix-nio[e2e]'",
exc,
)
client = nio.AsyncClient(self._homeserver, self._user_id or "")
else:
client = nio.AsyncClient(self._homeserver, self._user_id or "")
self._client = client
# Authenticate.
if self._access_token:
client.access_token = self._access_token
# Resolve user_id if not set.
if not self._user_id:
resp = await client.whoami()
if isinstance(resp, nio.WhoamiResponse):
self._user_id = resp.user_id
client.user_id = resp.user_id
logger.info("Matrix: authenticated as %s", self._user_id)
else:
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
)
await client.close()
return False
else:
client.user_id = self._user_id
logger.info("Matrix: using access token for %s", self._user_id)
elif self._password and self._user_id:
resp = await client.login(
self._password,
device_name="Hermes Agent",
)
if isinstance(resp, nio.LoginResponse):
logger.info("Matrix: logged in as %s", self._user_id)
else:
logger.error("Matrix: login failed — %s", getattr(resp, "message", resp))
await client.close()
return False
else:
logger.error("Matrix: need MATRIX_ACCESS_TOKEN or MATRIX_USER_ID + MATRIX_PASSWORD")
await client.close()
return False
# If E2EE is enabled, load the crypto store.
if self._encryption and hasattr(client, "olm"):
try:
if client.should_upload_keys:
await client.keys_upload()
logger.info("Matrix: E2EE crypto initialized")
except Exception as exc:
logger.warning("Matrix: crypto init issue: %s", exc)
# Register event callbacks.
client.add_event_callback(self._on_room_message, nio.RoomMessageText)
client.add_event_callback(self._on_room_message_media, nio.RoomMessageImage)
client.add_event_callback(self._on_room_message_media, nio.RoomMessageAudio)
client.add_event_callback(self._on_room_message_media, nio.RoomMessageVideo)
client.add_event_callback(self._on_room_message_media, nio.RoomMessageFile)
client.add_event_callback(self._on_invite, nio.InviteMemberEvent)
# If E2EE: handle encrypted events.
if self._encryption and hasattr(client, "olm"):
client.add_event_callback(
self._on_room_message, nio.MegolmEvent
)
# Initial sync to catch up, then start background sync.
self._startup_ts = time.time()
self._closing = False
# Do an initial sync to populate room state.
resp = await client.sync(timeout=10000, full_state=True)
if isinstance(resp, nio.SyncResponse):
self._joined_rooms = set(resp.rooms.join.keys())
logger.info(
"Matrix: initial sync complete, joined %d rooms",
len(self._joined_rooms),
)
# Build DM room cache from m.direct account data.
await self._refresh_dm_cache()
else:
logger.warning("Matrix: initial sync returned %s", type(resp).__name__)
# Start the sync loop.
self._sync_task = asyncio.create_task(self._sync_loop())
self._mark_connected()
return True
async def disconnect(self) -> None:
"""Disconnect from Matrix."""
self._closing = True
if self._sync_task and not self._sync_task.done():
self._sync_task.cancel()
try:
await self._sync_task
except (asyncio.CancelledError, Exception):
pass
if self._client:
await self._client.close()
self._client = None
logger.info("Matrix: disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a message to a Matrix room."""
import nio
if not content:
return SendResult(success=True)
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, MAX_MESSAGE_LENGTH)
last_event_id = None
for chunk in chunks:
msg_content: Dict[str, Any] = {
"msgtype": "m.text",
"body": chunk,
}
# Convert markdown to HTML for rich rendering.
html = self._markdown_to_html(chunk)
if html and html != chunk:
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = html
# Reply-to support.
if reply_to:
msg_content["m.relates_to"] = {
"m.in_reply_to": {"event_id": reply_to}
}
# Thread support: if metadata has thread_id, send as threaded reply.
thread_id = (metadata or {}).get("thread_id")
if thread_id:
relates_to = msg_content.get("m.relates_to", {})
relates_to["rel_type"] = "m.thread"
relates_to["event_id"] = thread_id
relates_to["is_falling_back"] = True
if reply_to and "m.in_reply_to" not in relates_to:
relates_to["m.in_reply_to"] = {"event_id": reply_to}
msg_content["m.relates_to"] = relates_to
resp = await self._client.room_send(
chat_id,
"m.room.message",
msg_content,
)
if isinstance(resp, nio.RoomSendResponse):
last_event_id = resp.event_id
else:
err = getattr(resp, "message", str(resp))
logger.error("Matrix: failed to send to %s: %s", chat_id, err)
return SendResult(success=False, error=err)
return SendResult(success=True, message_id=last_event_id)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return room name and type (dm/group)."""
name = chat_id
chat_type = "group"
if self._client:
room = self._client.rooms.get(chat_id)
if room:
name = room.display_name or room.canonical_alias or chat_id
# Use DM cache.
if self._dm_rooms.get(chat_id, False):
chat_type = "dm"
elif room.member_count == 2:
chat_type = "dm"
return {"name": name, "type": chat_type}
# ------------------------------------------------------------------
# Optional overrides
# ------------------------------------------------------------------
async def send_typing(
self, chat_id: str, metadata: Optional[Dict[str, Any]] = None
) -> None:
"""Send a typing indicator."""
if self._client:
try:
await self._client.room_typing(chat_id, typing_state=True, timeout=30000)
except Exception:
pass
async def edit_message(
self, chat_id: str, message_id: str, content: str
) -> SendResult:
"""Edit an existing message (via m.replace)."""
import nio
formatted = self.format_message(content)
msg_content: Dict[str, Any] = {
"msgtype": "m.text",
"body": f"* {formatted}",
"m.new_content": {
"msgtype": "m.text",
"body": formatted,
},
"m.relates_to": {
"rel_type": "m.replace",
"event_id": message_id,
},
}
html = self._markdown_to_html(formatted)
if html and html != formatted:
msg_content["m.new_content"]["format"] = "org.matrix.custom.html"
msg_content["m.new_content"]["formatted_body"] = html
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = f"* {html}"
resp = await self._client.room_send(chat_id, "m.room.message", msg_content)
if isinstance(resp, nio.RoomSendResponse):
return SendResult(success=True, message_id=resp.event_id)
return SendResult(success=False, error=getattr(resp, "message", str(resp)))
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Download an image URL and upload it to Matrix."""
try:
# Try aiohttp first (always available), fall back to httpx
try:
import aiohttp as _aiohttp
async with _aiohttp.ClientSession() as http:
async with http.get(image_url, timeout=_aiohttp.ClientTimeout(total=30)) as resp:
resp.raise_for_status()
data = await resp.read()
ct = resp.content_type or "image/png"
fname = image_url.rsplit("/", 1)[-1].split("?")[0] or "image.png"
except ImportError:
import httpx
async with httpx.AsyncClient() as http:
resp = await http.get(image_url, follow_redirects=True, timeout=30)
resp.raise_for_status()
data = resp.content
ct = resp.headers.get("content-type", "image/png")
fname = image_url.rsplit("/", 1)[-1].split("?")[0] or "image.png"
except Exception as exc:
logger.warning("Matrix: failed to download image %s: %s", image_url, exc)
return await self.send(chat_id, f"{caption or ''}\n{image_url}".strip(), reply_to)
return await self._upload_and_send(chat_id, data, fname, ct, "m.image", caption, reply_to, metadata)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a local image file to Matrix."""
return await self._send_local_file(chat_id, image_path, "m.image", caption, reply_to, metadata=metadata)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a local file as a document."""
return await self._send_local_file(chat_id, file_path, "m.file", caption, reply_to, file_name, metadata)
async def send_voice(
self,
chat_id: str,
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload an audio file as a voice message."""
return await self._send_local_file(chat_id, audio_path, "m.audio", caption, reply_to, metadata=metadata)
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a video file."""
return await self._send_local_file(chat_id, video_path, "m.video", caption, reply_to, metadata=metadata)
def format_message(self, content: str) -> str:
"""Pass-through — Matrix supports standard Markdown natively."""
# Strip image markdown; media is uploaded separately.
content = re.sub(r"!\[([^\]]*)\]\(([^)]+)\)", r"\2", content)
return content
# ------------------------------------------------------------------
# File helpers
# ------------------------------------------------------------------
async def _upload_and_send(
self,
room_id: str,
data: bytes,
filename: str,
content_type: str,
msgtype: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload bytes to Matrix and send as a media message."""
import nio
# Upload to homeserver.
resp = await self._client.upload(
data,
content_type=content_type,
filename=filename,
)
if not isinstance(resp, nio.UploadResponse):
err = getattr(resp, "message", str(resp))
logger.error("Matrix: upload failed: %s", err)
return SendResult(success=False, error=err)
mxc_url = resp.content_uri
# Build media message content.
msg_content: Dict[str, Any] = {
"msgtype": msgtype,
"body": caption or filename,
"url": mxc_url,
"info": {
"mimetype": content_type,
"size": len(data),
},
}
if reply_to:
msg_content["m.relates_to"] = {
"m.in_reply_to": {"event_id": reply_to}
}
thread_id = (metadata or {}).get("thread_id")
if thread_id:
relates_to = msg_content.get("m.relates_to", {})
relates_to["rel_type"] = "m.thread"
relates_to["event_id"] = thread_id
relates_to["is_falling_back"] = True
msg_content["m.relates_to"] = relates_to
resp2 = await self._client.room_send(room_id, "m.room.message", msg_content)
if isinstance(resp2, nio.RoomSendResponse):
return SendResult(success=True, message_id=resp2.event_id)
return SendResult(success=False, error=getattr(resp2, "message", str(resp2)))
async def _send_local_file(
self,
room_id: str,
file_path: str,
msgtype: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
file_name: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Read a local file and upload it."""
p = Path(file_path)
if not p.exists():
return await self.send(
room_id, f"{caption or ''}\n(file not found: {file_path})", reply_to
)
fname = file_name or p.name
ct = mimetypes.guess_type(fname)[0] or "application/octet-stream"
data = p.read_bytes()
return await self._upload_and_send(room_id, data, fname, ct, msgtype, caption, reply_to, metadata)
# ------------------------------------------------------------------
# Sync loop
# ------------------------------------------------------------------
async def _sync_loop(self) -> None:
"""Continuously sync with the homeserver."""
while not self._closing:
try:
await self._client.sync(timeout=30000)
except asyncio.CancelledError:
return
except Exception as exc:
if self._closing:
return
logger.warning("Matrix: sync error: %s — retrying in 5s", exc)
await asyncio.sleep(5)
# ------------------------------------------------------------------
# Event callbacks
# ------------------------------------------------------------------
async def _on_room_message(self, room: Any, event: Any) -> None:
"""Handle incoming text messages (and decrypted megolm events)."""
import nio
# Ignore own messages.
if event.sender == self._user_id:
return
# Deduplicate by event ID (nio can fire the same event more than once).
if self._is_duplicate_event(getattr(event, "event_id", None)):
return
# Startup grace: ignore old messages from initial sync.
event_ts = getattr(event, "server_timestamp", 0) / 1000.0
if event_ts and event_ts < self._startup_ts - _STARTUP_GRACE_SECONDS:
return
# Handle decrypted MegolmEvents — extract the inner event.
if isinstance(event, nio.MegolmEvent):
# Failed to decrypt.
logger.warning(
"Matrix: could not decrypt event %s in %s",
event.event_id, room.room_id,
)
return
# Skip edits (m.replace relation).
source_content = getattr(event, "source", {}).get("content", {})
relates_to = source_content.get("m.relates_to", {})
if relates_to.get("rel_type") == "m.replace":
return
body = getattr(event, "body", "") or ""
if not body:
return
# Determine chat type.
is_dm = self._dm_rooms.get(room.room_id, False)
if not is_dm and room.member_count == 2:
is_dm = True
chat_type = "dm" if is_dm else "group"
# Thread support.
thread_id = None
if relates_to.get("rel_type") == "m.thread":
thread_id = relates_to.get("event_id")
# Reply-to detection.
reply_to = None
in_reply_to = relates_to.get("m.in_reply_to", {})
if in_reply_to:
reply_to = in_reply_to.get("event_id")
# Strip reply fallback from body (Matrix prepends "> ..." lines).
if reply_to and body.startswith("> "):
lines = body.split("\n")
stripped = []
past_fallback = False
for line in lines:
if not past_fallback:
if line.startswith("> ") or line == ">":
continue
if line == "":
past_fallback = True
continue
past_fallback = True
stripped.append(line)
body = "\n".join(stripped) if stripped else body
# Message type.
msg_type = MessageType.TEXT
if body.startswith("!") or body.startswith("/"):
msg_type = MessageType.COMMAND
source = self.build_source(
chat_id=room.room_id,
chat_type=chat_type,
user_id=event.sender,
user_name=self._get_display_name(room, event.sender),
thread_id=thread_id,
)
msg_event = MessageEvent(
text=body,
message_type=msg_type,
source=source,
raw_message=getattr(event, "source", {}),
message_id=event.event_id,
reply_to_message_id=reply_to,
)
await self.handle_message(msg_event)
async def _on_room_message_media(self, room: Any, event: Any) -> None:
"""Handle incoming media messages (images, audio, video, files)."""
import nio
# Ignore own messages.
if event.sender == self._user_id:
return
# Deduplicate by event ID.
if self._is_duplicate_event(getattr(event, "event_id", None)):
return
# Startup grace.
event_ts = getattr(event, "server_timestamp", 0) / 1000.0
if event_ts and event_ts < self._startup_ts - _STARTUP_GRACE_SECONDS:
return
body = getattr(event, "body", "") or ""
url = getattr(event, "url", "")
# Convert mxc:// to HTTP URL for downstream processing.
http_url = ""
if url and url.startswith("mxc://"):
http_url = self._mxc_to_http(url)
# Determine message type from event class.
# Use the MIME type from the event's content info when available,
# falling back to category-level MIME types for downstream matching
# (gateway/run.py checks startswith("image/"), startswith("audio/"), etc.)
content_info = getattr(event, "content", {}) if isinstance(getattr(event, "content", None), dict) else {}
event_mimetype = (content_info.get("info") or {}).get("mimetype", "")
media_type = "application/octet-stream"
msg_type = MessageType.DOCUMENT
if isinstance(event, nio.RoomMessageImage):
msg_type = MessageType.PHOTO
media_type = event_mimetype or "image/png"
elif isinstance(event, nio.RoomMessageAudio):
msg_type = MessageType.AUDIO
media_type = event_mimetype or "audio/ogg"
elif isinstance(event, nio.RoomMessageVideo):
msg_type = MessageType.VIDEO
media_type = event_mimetype or "video/mp4"
elif event_mimetype:
media_type = event_mimetype
# For images, download and cache locally so vision tools can access them.
# Matrix MXC URLs require authentication, so direct URL access fails.
cached_path = None
if msg_type == MessageType.PHOTO and url:
try:
ext_map = {
"image/jpeg": ".jpg", "image/png": ".png",
"image/gif": ".gif", "image/webp": ".webp",
}
ext = ext_map.get(event_mimetype, ".jpg")
download_resp = await self._client.download(url)
if isinstance(download_resp, nio.DownloadResponse):
from gateway.platforms.base import cache_image_from_bytes
cached_path = cache_image_from_bytes(download_resp.body, ext=ext)
logger.info("[Matrix] Cached user image at %s", cached_path)
except Exception as e:
logger.warning("[Matrix] Failed to cache image: %s", e)
is_dm = self._dm_rooms.get(room.room_id, False)
if not is_dm and room.member_count == 2:
is_dm = True
chat_type = "dm" if is_dm else "group"
# Thread/reply detection.
source_content = getattr(event, "source", {}).get("content", {})
relates_to = source_content.get("m.relates_to", {})
thread_id = None
if relates_to.get("rel_type") == "m.thread":
thread_id = relates_to.get("event_id")
source = self.build_source(
chat_id=room.room_id,
chat_type=chat_type,
user_id=event.sender,
user_name=self._get_display_name(room, event.sender),
thread_id=thread_id,
)
# Use cached local path for images, HTTP URL for other media types
media_urls = [cached_path] if cached_path else ([http_url] if http_url else None)
media_types = [media_type] if media_urls else None
msg_event = MessageEvent(
text=body,
message_type=msg_type,
source=source,
raw_message=getattr(event, "source", {}),
message_id=event.event_id,
media_urls=media_urls,
media_types=media_types,
)
await self.handle_message(msg_event)
async def _on_invite(self, room: Any, event: Any) -> None:
"""Auto-join rooms when invited."""
import nio
if not isinstance(event, nio.InviteMemberEvent):
return
# Only process invites directed at us.
if event.state_key != self._user_id:
return
if event.membership != "invite":
return
logger.info(
"Matrix: invited to %s by %s — joining",
room.room_id, event.sender,
)
try:
resp = await self._client.join(room.room_id)
if isinstance(resp, nio.JoinResponse):
self._joined_rooms.add(room.room_id)
logger.info("Matrix: joined %s", room.room_id)
# Refresh DM cache since new room may be a DM.
await self._refresh_dm_cache()
else:
logger.warning(
"Matrix: failed to join %s: %s",
room.room_id, getattr(resp, "message", resp),
)
except Exception as exc:
logger.warning("Matrix: error joining %s: %s", room.room_id, exc)
# ------------------------------------------------------------------
# Helpers
# ------------------------------------------------------------------
async def _refresh_dm_cache(self) -> None:
"""Refresh the DM room cache from m.direct account data.
Tries the account_data API first, then falls back to parsing
the sync response's account_data for robustness.
"""
if not self._client:
return
dm_data: Optional[Dict] = None
# Primary: try the dedicated account data endpoint.
try:
resp = await self._client.get_account_data("m.direct")
if hasattr(resp, "content"):
dm_data = resp.content
elif isinstance(resp, dict):
dm_data = resp
except Exception as exc:
logger.debug("Matrix: get_account_data('m.direct') failed: %s — trying sync fallback", exc)
# Fallback: parse from the client's account_data store (populated by sync).
if dm_data is None:
try:
# matrix-nio stores account data events on the client object
ad = getattr(self._client, "account_data", None)
if ad and isinstance(ad, dict) and "m.direct" in ad:
event = ad["m.direct"]
if hasattr(event, "content"):
dm_data = event.content
elif isinstance(event, dict):
dm_data = event
except Exception:
pass
if dm_data is None:
return
dm_room_ids: Set[str] = set()
for user_id, rooms in dm_data.items():
if isinstance(rooms, list):
dm_room_ids.update(rooms)
self._dm_rooms = {
rid: (rid in dm_room_ids)
for rid in self._joined_rooms
}
def _get_display_name(self, room: Any, user_id: str) -> str:
"""Get a user's display name in a room, falling back to user_id."""
if room and hasattr(room, "users"):
user = room.users.get(user_id)
if user and getattr(user, "display_name", None):
return user.display_name
# Strip the @...:server format to just the localpart.
if user_id.startswith("@") and ":" in user_id:
return user_id[1:].split(":")[0]
return user_id
def _mxc_to_http(self, mxc_url: str) -> str:
"""Convert mxc://server/media_id to an HTTP download URL."""
# mxc://matrix.org/abc123 → https://matrix.org/_matrix/client/v1/media/download/matrix.org/abc123
# Uses the authenticated client endpoint (spec v1.11+) instead of the
# deprecated /_matrix/media/v3/download/ path.
if not mxc_url.startswith("mxc://"):
return mxc_url
parts = mxc_url[6:] # strip mxc://
# Use our homeserver for download (federation handles the rest).
return f"{self._homeserver}/_matrix/client/v1/media/download/{parts}"
def _markdown_to_html(self, text: str) -> str:
"""Convert Markdown to Matrix-compatible HTML.
Uses a simple conversion for common patterns. For full fidelity
a markdown-it style library could be used, but this covers the
common cases without an extra dependency.
"""
try:
import markdown
html = markdown.markdown(
text,
extensions=["fenced_code", "tables", "nl2br"],
)
# Strip wrapping <p> tags for single-paragraph messages.
if html.count("<p>") == 1:
html = html.replace("<p>", "").replace("</p>", "")
return html
except ImportError:
pass
# Minimal fallback: just handle bold, italic, code.
html = text
html = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", html)
html = re.sub(r"\*(.+?)\*", r"<em>\1</em>", html)
html = re.sub(r"`([^`]+)`", r"<code>\1</code>", html)
html = re.sub(r"\n", r"<br>", html)
return html

View file

@ -0,0 +1,682 @@
"""Mattermost gateway adapter.
Connects to a self-hosted (or cloud) Mattermost instance via its REST API
(v4) and WebSocket for real-time events. No external Mattermost library
required uses aiohttp which is already a Hermes dependency.
Environment variables:
MATTERMOST_URL Server URL (e.g. https://mm.example.com)
MATTERMOST_TOKEN Bot token or personal-access token
MATTERMOST_ALLOWED_USERS Comma-separated user IDs
MATTERMOST_HOME_CHANNEL Channel ID for cron/notification delivery
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import re
import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
# Mattermost post size limit (server default is 16383, but 4000 is the
# practical limit for readable messages — matching OpenClaw's choice).
MAX_POST_LENGTH = 4000
# Channel type codes returned by the Mattermost API.
_CHANNEL_TYPE_MAP = {
"D": "dm",
"G": "group",
"P": "group", # private channel → treat as group
"O": "channel",
}
# Reconnect parameters (exponential backoff).
_RECONNECT_BASE_DELAY = 2.0
_RECONNECT_MAX_DELAY = 60.0
_RECONNECT_JITTER = 0.2
def check_mattermost_requirements() -> bool:
"""Return True if the Mattermost adapter can be used."""
token = os.getenv("MATTERMOST_TOKEN", "")
url = os.getenv("MATTERMOST_URL", "")
if not token:
logger.debug("Mattermost: MATTERMOST_TOKEN not set")
return False
if not url:
logger.warning("Mattermost: MATTERMOST_URL not set")
return False
try:
import aiohttp # noqa: F401
return True
except ImportError:
logger.warning("Mattermost: aiohttp not installed")
return False
class MattermostAdapter(BasePlatformAdapter):
"""Gateway adapter for Mattermost (self-hosted or cloud)."""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.MATTERMOST)
self._base_url: str = (
config.extra.get("url", "")
or os.getenv("MATTERMOST_URL", "")
).rstrip("/")
self._token: str = config.token or os.getenv("MATTERMOST_TOKEN", "")
self._bot_user_id: str = ""
self._bot_username: str = ""
# aiohttp session + websocket handle
self._session: Any = None # aiohttp.ClientSession
self._ws: Any = None # aiohttp.ClientWebSocketResponse
self._ws_task: Optional[asyncio.Task] = None
self._reconnect_task: Optional[asyncio.Task] = None
self._closing = False
# Reply mode: "thread" to nest replies, "off" for flat messages.
self._reply_mode: str = (
config.extra.get("reply_mode", "")
or os.getenv("MATTERMOST_REPLY_MODE", "off")
).lower()
# Dedup cache: post_id → timestamp (prevent reprocessing)
self._seen_posts: Dict[str, float] = {}
self._SEEN_MAX = 2000
self._SEEN_TTL = 300 # 5 minutes
# ------------------------------------------------------------------
# HTTP helpers
# ------------------------------------------------------------------
def _headers(self) -> Dict[str, str]:
return {
"Authorization": f"Bearer {self._token}",
"Content-Type": "application/json",
}
async def _api_get(self, path: str) -> Dict[str, Any]:
"""GET /api/v4/{path}."""
import aiohttp
url = f"{self._base_url}/api/v4/{path.lstrip('/')}"
try:
async with self._session.get(url, headers=self._headers()) as resp:
if resp.status >= 400:
body = await resp.text()
logger.error("MM API GET %s%s: %s", path, resp.status, body[:200])
return {}
return await resp.json()
except aiohttp.ClientError as exc:
logger.error("MM API GET %s network error: %s", path, exc)
return {}
async def _api_post(
self, path: str, payload: Dict[str, Any]
) -> Dict[str, Any]:
"""POST /api/v4/{path} with JSON body."""
import aiohttp
url = f"{self._base_url}/api/v4/{path.lstrip('/')}"
try:
async with self._session.post(
url, headers=self._headers(), json=payload
) as resp:
if resp.status >= 400:
body = await resp.text()
logger.error("MM API POST %s%s: %s", path, resp.status, body[:200])
return {}
return await resp.json()
except aiohttp.ClientError as exc:
logger.error("MM API POST %s network error: %s", path, exc)
return {}
async def _api_put(
self, path: str, payload: Dict[str, Any]
) -> Dict[str, Any]:
"""PUT /api/v4/{path} with JSON body."""
import aiohttp
url = f"{self._base_url}/api/v4/{path.lstrip('/')}"
try:
async with self._session.put(
url, headers=self._headers(), json=payload
) as resp:
if resp.status >= 400:
body = await resp.text()
logger.error("MM API PUT %s%s: %s", path, resp.status, body[:200])
return {}
return await resp.json()
except aiohttp.ClientError as exc:
logger.error("MM API PUT %s network error: %s", path, exc)
return {}
async def _upload_file(
self, channel_id: str, file_data: bytes, filename: str, content_type: str = "application/octet-stream"
) -> Optional[str]:
"""Upload a file and return its file ID, or None on failure."""
import aiohttp
url = f"{self._base_url}/api/v4/files"
form = aiohttp.FormData()
form.add_field("channel_id", channel_id)
form.add_field(
"files",
file_data,
filename=filename,
content_type=content_type,
)
headers = {"Authorization": f"Bearer {self._token}"}
async with self._session.post(url, headers=headers, data=form) as resp:
if resp.status >= 400:
body = await resp.text()
logger.error("MM file upload → %s: %s", resp.status, body[:200])
return None
data = await resp.json()
infos = data.get("file_infos", [])
return infos[0]["id"] if infos else None
# ------------------------------------------------------------------
# Required overrides
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to Mattermost and start the WebSocket listener."""
import aiohttp
if not self._base_url or not self._token:
logger.error("Mattermost: URL or token not configured")
return False
self._session = aiohttp.ClientSession()
self._closing = False
# Verify credentials and fetch bot identity.
me = await self._api_get("users/me")
if not me or "id" not in me:
logger.error("Mattermost: failed to authenticate — check MATTERMOST_TOKEN and MATTERMOST_URL")
await self._session.close()
return False
self._bot_user_id = me["id"]
self._bot_username = me.get("username", "")
logger.info(
"Mattermost: authenticated as @%s (%s) on %s",
self._bot_username,
self._bot_user_id,
self._base_url,
)
# Start WebSocket in background.
self._ws_task = asyncio.create_task(self._ws_loop())
self._mark_connected()
return True
async def disconnect(self) -> None:
"""Disconnect from Mattermost."""
self._closing = True
if self._ws_task and not self._ws_task.done():
self._ws_task.cancel()
try:
await self._ws_task
except (asyncio.CancelledError, Exception):
pass
if self._reconnect_task and not self._reconnect_task.done():
self._reconnect_task.cancel()
if self._ws:
await self._ws.close()
self._ws = None
if self._session and not self._session.closed:
await self._session.close()
logger.info("Mattermost: disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a message (or multiple chunks) to a channel."""
if not content:
return SendResult(success=True)
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, MAX_POST_LENGTH)
last_id = None
for chunk in chunks:
payload: Dict[str, Any] = {
"channel_id": chat_id,
"message": chunk,
}
# Thread support: reply_to is the root post ID.
if reply_to and self._reply_mode == "thread":
payload["root_id"] = reply_to
data = await self._api_post("posts", payload)
if not data or "id" not in data:
return SendResult(success=False, error="Failed to create post")
last_id = data["id"]
return SendResult(success=True, message_id=last_id)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return channel name and type."""
data = await self._api_get(f"channels/{chat_id}")
if not data:
return {"name": chat_id, "type": "channel"}
ch_type = _CHANNEL_TYPE_MAP.get(data.get("type", "O"), "channel")
display_name = data.get("display_name") or data.get("name") or chat_id
return {"name": display_name, "type": ch_type}
# ------------------------------------------------------------------
# Optional overrides
# ------------------------------------------------------------------
async def send_typing(
self, chat_id: str, metadata: Optional[Dict[str, Any]] = None
) -> None:
"""Send a typing indicator."""
await self._api_post(
f"users/{self._bot_user_id}/typing",
{"channel_id": chat_id},
)
async def edit_message(
self, chat_id: str, message_id: str, content: str
) -> SendResult:
"""Edit an existing post."""
formatted = self.format_message(content)
data = await self._api_put(
f"posts/{message_id}/patch",
{"message": formatted},
)
if not data or "id" not in data:
return SendResult(success=False, error="Failed to edit post")
return SendResult(success=True, message_id=data["id"])
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Download an image and upload it as a file attachment."""
return await self._send_url_as_file(
chat_id, image_url, caption, reply_to, "image"
)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a local image file."""
return await self._send_local_file(
chat_id, image_path, caption, reply_to
)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a local file as a document."""
return await self._send_local_file(
chat_id, file_path, caption, reply_to, file_name
)
async def send_voice(
self,
chat_id: str,
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload an audio file."""
return await self._send_local_file(
chat_id, audio_path, caption, reply_to
)
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a video file."""
return await self._send_local_file(
chat_id, video_path, caption, reply_to
)
def format_message(self, content: str) -> str:
"""Mattermost uses standard Markdown — mostly pass through.
Strip image markdown into plain links (files are uploaded separately).
"""
# Convert ![alt](url) to just the URL — Mattermost renders
# image URLs as inline previews automatically.
content = re.sub(r"!\[([^\]]*)\]\(([^)]+)\)", r"\2", content)
return content
# ------------------------------------------------------------------
# File helpers
# ------------------------------------------------------------------
async def _send_url_as_file(
self,
chat_id: str,
url: str,
caption: Optional[str],
reply_to: Optional[str],
kind: str = "file",
) -> SendResult:
"""Download a URL and upload it as a file attachment."""
import aiohttp
try:
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status >= 400:
# Fall back to sending the URL as text.
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_data = await resp.read()
ct = resp.content_type or "application/octet-stream"
# Derive filename from URL.
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
except Exception as exc:
logger.warning("Mattermost: failed to download %s: %s", url, exc)
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_id = await self._upload_file(chat_id, file_data, fname, ct)
if not file_id:
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
payload: Dict[str, Any] = {
"channel_id": chat_id,
"message": caption or "",
"file_ids": [file_id],
}
if reply_to and self._reply_mode == "thread":
payload["root_id"] = reply_to
data = await self._api_post("posts", payload)
if not data or "id" not in data:
return SendResult(success=False, error="Failed to post with file")
return SendResult(success=True, message_id=data["id"])
async def _send_local_file(
self,
chat_id: str,
file_path: str,
caption: Optional[str],
reply_to: Optional[str],
file_name: Optional[str] = None,
) -> SendResult:
"""Upload a local file and attach it to a post."""
import mimetypes
p = Path(file_path)
if not p.exists():
return await self.send(
chat_id, f"{caption or ''}\n(file not found: {file_path})", reply_to
)
fname = file_name or p.name
ct = mimetypes.guess_type(fname)[0] or "application/octet-stream"
file_data = p.read_bytes()
file_id = await self._upload_file(chat_id, file_data, fname, ct)
if not file_id:
return SendResult(success=False, error="File upload failed")
payload: Dict[str, Any] = {
"channel_id": chat_id,
"message": caption or "",
"file_ids": [file_id],
}
if reply_to and self._reply_mode == "thread":
payload["root_id"] = reply_to
data = await self._api_post("posts", payload)
if not data or "id" not in data:
return SendResult(success=False, error="Failed to post with file")
return SendResult(success=True, message_id=data["id"])
# ------------------------------------------------------------------
# WebSocket
# ------------------------------------------------------------------
async def _ws_loop(self) -> None:
"""Connect to the WebSocket and listen for events, reconnecting on failure."""
delay = _RECONNECT_BASE_DELAY
while not self._closing:
try:
await self._ws_connect_and_listen()
# Clean disconnect — reset delay.
delay = _RECONNECT_BASE_DELAY
except asyncio.CancelledError:
return
except Exception as exc:
if self._closing:
return
logger.warning("Mattermost WS error: %s — reconnecting in %.0fs", exc, delay)
if self._closing:
return
# Exponential backoff with jitter.
import random
jitter = delay * _RECONNECT_JITTER * random.random()
await asyncio.sleep(delay + jitter)
delay = min(delay * 2, _RECONNECT_MAX_DELAY)
async def _ws_connect_and_listen(self) -> None:
"""Single WebSocket session: connect, authenticate, process events."""
# Build WS URL: https:// → wss://, http:// → ws://
ws_url = re.sub(r"^http", "ws", self._base_url) + "/api/v4/websocket"
logger.info("Mattermost: connecting to %s", ws_url)
self._ws = await self._session.ws_connect(ws_url, heartbeat=30.0)
# Authenticate via the WebSocket.
auth_msg = {
"seq": 1,
"action": "authentication_challenge",
"data": {"token": self._token},
}
await self._ws.send_json(auth_msg)
logger.info("Mattermost: WebSocket connected and authenticated")
async for raw_msg in self._ws:
if self._closing:
return
if raw_msg.type in (
raw_msg.type.TEXT,
raw_msg.type.BINARY,
):
try:
event = json.loads(raw_msg.data)
except (json.JSONDecodeError, TypeError):
continue
await self._handle_ws_event(event)
elif raw_msg.type in (
raw_msg.type.ERROR,
raw_msg.type.CLOSE,
raw_msg.type.CLOSING,
raw_msg.type.CLOSED,
):
logger.info("Mattermost: WebSocket closed (%s)", raw_msg.type)
break
async def _handle_ws_event(self, event: Dict[str, Any]) -> None:
"""Process a single WebSocket event."""
event_type = event.get("event")
if event_type != "posted":
return
data = event.get("data", {})
raw_post_str = data.get("post")
if not raw_post_str:
return
try:
post = json.loads(raw_post_str)
except (json.JSONDecodeError, TypeError):
return
# Ignore own messages.
if post.get("user_id") == self._bot_user_id:
return
# Ignore system posts.
if post.get("type"):
return
post_id = post.get("id", "")
# Dedup.
self._prune_seen()
if post_id in self._seen_posts:
return
self._seen_posts[post_id] = time.time()
# Build message event.
channel_id = post.get("channel_id", "")
channel_type_raw = data.get("channel_type", "O")
chat_type = _CHANNEL_TYPE_MAP.get(channel_type_raw, "channel")
# For DMs, user_id is sufficient. For channels, check for @mention.
message_text = post.get("message", "")
# Mention-only mode: skip channel messages that don't @mention the bot.
# DMs (type "D") are always processed.
if channel_type_raw != "D":
mention_patterns = [
f"@{self._bot_username}",
f"@{self._bot_user_id}",
]
has_mention = any(
pattern.lower() in message_text.lower()
for pattern in mention_patterns
)
if not has_mention:
logger.debug(
"Mattermost: skipping non-DM message without @mention (channel=%s)",
channel_id,
)
return
# Resolve sender info.
sender_id = post.get("user_id", "")
sender_name = data.get("sender_name", "").lstrip("@") or sender_id
# Thread support: if the post is in a thread, use root_id.
thread_id = post.get("root_id") or None
# Determine message type.
file_ids = post.get("file_ids") or []
msg_type = MessageType.TEXT
if message_text.startswith("/"):
msg_type = MessageType.COMMAND
# Download file attachments immediately (URLs require auth headers
# that downstream tools won't have).
media_urls: List[str] = []
media_types: List[str] = []
for fid in file_ids:
try:
file_info = await self._api_get(f"files/{fid}/info")
fname = file_info.get("name", f"file_{fid}")
ext = Path(fname).suffix or ""
mime = file_info.get("mime_type", "application/octet-stream")
import aiohttp
dl_url = f"{self._base_url}/api/v4/files/{fid}"
async with self._session.get(
dl_url,
headers={"Authorization": f"Bearer {self._token}"},
timeout=aiohttp.ClientTimeout(total=30),
) as resp:
if resp.status < 400:
file_data = await resp.read()
from gateway.platforms.base import cache_image_from_bytes, cache_document_from_bytes
if mime.startswith("image/"):
local_path = cache_image_from_bytes(file_data, ext or ".png")
media_urls.append(local_path)
media_types.append(mime)
elif mime.startswith("audio/"):
from gateway.platforms.base import cache_audio_from_bytes
local_path = cache_audio_from_bytes(file_data, ext or ".ogg")
media_urls.append(local_path)
media_types.append(mime)
else:
local_path = cache_document_from_bytes(file_data, fname)
media_urls.append(local_path)
media_types.append(mime)
else:
logger.warning("Mattermost: failed to download file %s: HTTP %s", fid, resp.status)
except Exception as exc:
logger.warning("Mattermost: error downloading file %s: %s", fid, exc)
source = self.build_source(
chat_id=channel_id,
chat_type=chat_type,
user_id=sender_id,
user_name=sender_name,
thread_id=thread_id,
)
msg_event = MessageEvent(
text=message_text,
message_type=msg_type,
source=source,
raw_message=post,
message_id=post_id,
media_urls=media_urls if media_urls else None,
media_types=media_types if media_types else None,
)
await self.handle_message(msg_event)
def _prune_seen(self) -> None:
"""Remove expired entries from the dedup cache."""
if len(self._seen_posts) < self._SEEN_MAX:
return
now = time.time()
self._seen_posts = {
pid: ts
for pid, ts in self._seen_posts.items()
if now - ts < self._SEEN_TTL
}

View file

@ -0,0 +1,766 @@
"""Signal messenger platform adapter.
Connects to a signal-cli daemon running in HTTP mode.
Inbound messages arrive via SSE (Server-Sent Events) streaming.
Outbound messages and actions use JSON-RPC 2.0 over HTTP.
Based on PR #268 by ibhagwan, rebuilt with bug fixes.
Requires:
- signal-cli installed and running: signal-cli daemon --http 127.0.0.1:8080
- SIGNAL_HTTP_URL and SIGNAL_ACCOUNT environment variables set
"""
import asyncio
import base64
import json
import logging
import os
import random
import re
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any
from urllib.parse import unquote
import httpx
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_image_from_bytes,
cache_audio_from_bytes,
cache_document_from_bytes,
cache_image_from_url,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SIGNAL_MAX_ATTACHMENT_SIZE = 100 * 1024 * 1024 # 100 MB
MAX_MESSAGE_LENGTH = 8000 # Signal message size limit
TYPING_INTERVAL = 8.0 # seconds between typing indicator refreshes
SSE_RETRY_DELAY_INITIAL = 2.0
SSE_RETRY_DELAY_MAX = 60.0
HEALTH_CHECK_INTERVAL = 30.0 # seconds between health checks
HEALTH_CHECK_STALE_THRESHOLD = 120.0 # seconds without SSE activity before concern
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +155****4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
return phone[:4] + "****" + phone[-4:]
def _parse_comma_list(value: str) -> List[str]:
"""Split a comma-separated string into a list, stripping whitespace."""
return [v.strip() for v in value.split(",") if v.strip()]
def _guess_extension(data: bytes) -> str:
"""Guess file extension from magic bytes."""
if data[:4] == b"\x89PNG":
return ".png"
if data[:2] == b"\xff\xd8":
return ".jpg"
if data[:4] == b"GIF8":
return ".gif"
if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
return ".webp"
if data[:4] == b"%PDF":
return ".pdf"
if len(data) >= 8 and data[4:8] == b"ftyp":
return ".mp4"
if data[:4] == b"OggS":
return ".ogg"
if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
return ".mp3"
if data[:2] == b"PK":
return ".zip"
return ".bin"
def _is_image_ext(ext: str) -> bool:
return ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp")
def _is_audio_ext(ext: str) -> bool:
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
_EXT_TO_MIME = {
".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png",
".gif": "image/gif", ".webp": "image/webp",
".ogg": "audio/ogg", ".mp3": "audio/mpeg", ".wav": "audio/wav",
".m4a": "audio/mp4", ".aac": "audio/aac",
".mp4": "video/mp4", ".pdf": "application/pdf", ".zip": "application/zip",
}
def _ext_to_mime(ext: str) -> str:
"""Map file extension to MIME type."""
return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")
def _render_mentions(text: str, mentions: list) -> str:
"""Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
Signal encodes @mentions as the Unicode object replacement character
with out-of-band metadata containing the mentioned user's UUID/number.
"""
if not mentions or "\uFFFC" not in text:
return text
# Sort mentions by start position (reverse) to replace from end to start
# so indices don't shift as we replace
sorted_mentions = sorted(mentions, key=lambda m: m.get("start", 0), reverse=True)
for mention in sorted_mentions:
start = mention.get("start", 0)
length = mention.get("length", 1)
# Use the mention's number or UUID as the replacement
identifier = mention.get("number") or mention.get("uuid") or "user"
replacement = f"@{identifier}"
text = text[:start] + replacement + text[start + length:]
return text
def check_signal_requirements() -> bool:
"""Check if Signal is configured (has URL and account)."""
return bool(os.getenv("SIGNAL_HTTP_URL") and os.getenv("SIGNAL_ACCOUNT"))
# ---------------------------------------------------------------------------
# Signal Adapter
# ---------------------------------------------------------------------------
class SignalAdapter(BasePlatformAdapter):
"""Signal messenger adapter using signal-cli HTTP daemon."""
platform = Platform.SIGNAL
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SIGNAL)
extra = config.extra or {}
self.http_url = extra.get("http_url", "http://127.0.0.1:8080").rstrip("/")
self.account = extra.get("account", "")
self.ignore_stories = extra.get("ignore_stories", True)
# Parse allowlists — group policy is derived from presence of group allowlist
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
# Background tasks
self._sse_task: Optional[asyncio.Task] = None
self._health_monitor_task: Optional[asyncio.Task] = None
self._typing_tasks: Dict[str, asyncio.Task] = {}
self._running = False
self._last_sse_activity = 0.0
self._sse_response: Optional[httpx.Response] = None
# Normalize account for self-message filtering
self._account_normalized = self.account.strip()
# Track recently sent message timestamps to prevent echo-back loops
# in Note to Self / self-chat mode (mirrors WhatsApp recentlySentIds)
self._recent_sent_timestamps: set = set()
self._max_recent_timestamps = 50
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to signal-cli daemon and start SSE listener."""
if not self.http_url or not self.account:
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
try:
resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
if resp.status_code != 200:
logger.error("Signal: health check failed (status %d)", resp.status_code)
return False
except Exception as e:
logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
return False
self._running = True
self._last_sse_activity = time.time()
self._sse_task = asyncio.create_task(self._sse_listener())
self._health_monitor_task = asyncio.create_task(self._health_monitor())
logger.info("Signal: connected to %s", self.http_url)
return True
async def disconnect(self) -> None:
"""Stop SSE listener and clean up."""
self._running = False
if self._sse_task:
self._sse_task.cancel()
try:
await self._sse_task
except asyncio.CancelledError:
pass
if self._health_monitor_task:
self._health_monitor_task.cancel()
try:
await self._health_monitor_task
except asyncio.CancelledError:
pass
# Cancel all typing tasks
for task in self._typing_tasks.values():
task.cancel()
self._typing_tasks.clear()
if self.client:
await self.client.aclose()
self.client = None
logger.info("Signal: disconnected")
# ------------------------------------------------------------------
# SSE Streaming (inbound messages)
# ------------------------------------------------------------------
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
try:
logger.debug("Signal SSE: connecting to %s", url)
async with self.client.stream(
"GET", url,
headers={"Accept": "text/event-stream"},
timeout=None,
) as response:
self._sse_response = response
backoff = SSE_RETRY_DELAY_INITIAL # Reset on successful connection
self._last_sse_activity = time.time()
logger.info("Signal SSE: connected")
buffer = ""
async for chunk in response.aiter_text():
if not self._running:
break
buffer += chunk
while "\n" in buffer:
line, buffer = buffer.split("\n", 1)
line = line.strip()
if not line:
continue
# Parse SSE data lines
if line.startswith("data:"):
data_str = line[5:].strip()
if not data_str:
continue
self._last_sse_activity = time.time()
try:
data = json.loads(data_str)
await self._handle_envelope(data)
except json.JSONDecodeError:
logger.debug("Signal SSE: invalid JSON: %s", data_str[:100])
except Exception:
logger.exception("Signal SSE: error handling event")
except asyncio.CancelledError:
break
except httpx.HTTPError as e:
if self._running:
logger.warning("Signal SSE: HTTP error: %s (reconnecting in %.0fs)", e, backoff)
except Exception as e:
if self._running:
logger.warning("Signal SSE: error: %s (reconnecting in %.0fs)", e, backoff)
if self._running:
# Add 20% jitter to prevent thundering herd on reconnection
jitter = backoff * 0.2 * random.random()
await asyncio.sleep(backoff + jitter)
backoff = min(backoff * 2, SSE_RETRY_DELAY_MAX)
self._sse_response = None
# ------------------------------------------------------------------
# Health Monitor
# ------------------------------------------------------------------
async def _health_monitor(self) -> None:
"""Monitor SSE connection health and force reconnect if stale."""
while self._running:
await asyncio.sleep(HEALTH_CHECK_INTERVAL)
if not self._running:
break
elapsed = time.time() - self._last_sse_activity
if elapsed > HEALTH_CHECK_STALE_THRESHOLD:
logger.warning("Signal: SSE idle for %.0fs, checking daemon health", elapsed)
try:
resp = await self.client.get(
f"{self.http_url}/api/v1/check", timeout=10.0
)
if resp.status_code == 200:
# Daemon is alive but SSE is idle — update activity to
# avoid repeated warnings (connection may just be quiet)
self._last_sse_activity = time.time()
logger.debug("Signal: daemon healthy, SSE idle")
else:
logger.warning("Signal: health check failed (%d), forcing reconnect", resp.status_code)
self._force_reconnect()
except Exception as e:
logger.warning("Signal: health check error: %s, forcing reconnect", e)
self._force_reconnect()
def _force_reconnect(self) -> None:
"""Force SSE reconnection by closing the current response."""
if self._sse_response and not self._sse_response.is_stream_consumed:
try:
asyncio.create_task(self._sse_response.aclose())
except Exception:
pass
self._sse_response = None
# ------------------------------------------------------------------
# Message Handling
# ------------------------------------------------------------------
async def _handle_envelope(self, envelope: dict) -> None:
"""Process an incoming signal-cli envelope."""
# Unwrap nested envelope if present
envelope_data = envelope.get("envelope", envelope)
# Handle syncMessage: extract "Note to Self" messages (sent to own account)
# while still filtering other sync events (read receipts, typing, etc.)
is_note_to_self = False
if "syncMessage" in envelope_data:
sync_msg = envelope_data.get("syncMessage")
if sync_msg and isinstance(sync_msg, dict):
sent_msg = sync_msg.get("sentMessage")
if sent_msg and isinstance(sent_msg, dict):
dest = sent_msg.get("destinationNumber") or sent_msg.get("destination")
sent_ts = sent_msg.get("timestamp")
if dest == self._account_normalized:
# Check if this is an echo of our own outbound reply
if sent_ts and sent_ts in self._recent_sent_timestamps:
self._recent_sent_timestamps.discard(sent_ts)
return
# Genuine user Note to Self — promote to dataMessage
is_note_to_self = True
envelope_data = {**envelope_data, "dataMessage": sent_msg}
if not is_note_to_self:
return
# Extract sender info
sender = (
envelope_data.get("sourceNumber")
or envelope_data.get("sourceUuid")
or envelope_data.get("source")
)
sender_name = envelope_data.get("sourceName", "")
sender_uuid = envelope_data.get("sourceUuid", "")
if not sender:
logger.debug("Signal: ignoring envelope with no sender")
return
# Self-message filtering — prevent reply loops (but allow Note to Self)
if self._account_normalized and sender == self._account_normalized and not is_note_to_self:
return
# Filter stories
if self.ignore_stories and envelope_data.get("storyMessage"):
return
# Get data message — also check editMessage (edited messages contain
# their updated dataMessage inside editMessage.dataMessage)
data_message = (
envelope_data.get("dataMessage")
or (envelope_data.get("editMessage") or {}).get("dataMessage")
)
if not data_message:
return
# Check for group message
group_info = data_message.get("groupInfo")
group_id = group_info.get("groupId") if group_info else None
is_group = bool(group_id)
# Group message filtering — derived from SIGNAL_GROUP_ALLOWED_USERS:
# - No env var set → groups disabled (default safe behavior)
# - Env var set with group IDs → only those groups allowed
# - Env var set with "*" → all groups allowed
# DM auth is fully handled by run.py (_is_user_authorized)
if is_group:
if not self.group_allow_from:
logger.debug("Signal: ignoring group message (no SIGNAL_GROUP_ALLOWED_USERS)")
return
if "*" not in self.group_allow_from and group_id not in self.group_allow_from:
logger.debug("Signal: group %s not in allowlist", group_id[:8] if group_id else "?")
return
# Build chat info
chat_id = sender if not is_group else f"group:{group_id}"
chat_type = "group" if is_group else "dm"
# Extract text and render mentions
text = data_message.get("message", "")
mentions = data_message.get("mentions", [])
if text and mentions:
text = _render_mentions(text, mentions)
# Process attachments
attachments_data = data_message.get("attachments", [])
media_urls = []
media_types = []
if attachments_data and not getattr(self, "ignore_attachments", False):
for att in attachments_data:
att_id = att.get("id")
att_size = att.get("size", 0)
if not att_id:
continue
if att_size > SIGNAL_MAX_ATTACHMENT_SIZE:
logger.warning("Signal: attachment too large (%d bytes), skipping", att_size)
continue
try:
cached_path, ext = await self._fetch_attachment(att_id)
if cached_path:
# Use contentType from Signal if available, else map from extension
content_type = att.get("contentType") or _ext_to_mime(ext)
media_urls.append(cached_path)
media_types.append(content_type)
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Build session source
source = self.build_source(
chat_id=chat_id,
chat_name=group_info.get("groupName") if group_info else sender_name,
chat_type=chat_type,
user_id=sender,
user_name=sender_name or sender,
user_id_alt=sender_uuid if sender_uuid else None,
chat_id_alt=group_id if is_group else None,
)
# Determine message type from media
msg_type = MessageType.TEXT
if media_types:
if any(mt.startswith("audio/") for mt in media_types):
msg_type = MessageType.VOICE
elif any(mt.startswith("image/") for mt in media_types):
msg_type = MessageType.PHOTO
# Parse timestamp from envelope data (milliseconds since epoch)
ts_ms = envelope_data.get("timestamp", 0)
if ts_ms:
try:
timestamp = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
except (ValueError, OSError):
timestamp = datetime.now(tz=timezone.utc)
else:
timestamp = datetime.now(tz=timezone.utc)
# Build and dispatch event
event = MessageEvent(
source=source,
text=text or "",
message_type=msg_type,
media_urls=media_urls,
media_types=media_types,
timestamp=timestamp,
)
logger.debug("Signal: message from %s in %s: %s",
_redact_phone(sender), chat_id[:20], (text or "")[:50])
await self.handle_message(event)
# ------------------------------------------------------------------
# Attachment Handling
# ------------------------------------------------------------------
async def _fetch_attachment(self, attachment_id: str) -> tuple:
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc("getAttachment", {
"account": self.account,
"attachmentId": attachment_id,
})
if not result:
return None, ""
# Handle dict response (signal-cli returns {"data": "base64..."})
if isinstance(result, dict):
result = result.get("data")
if not result:
logger.warning("Signal: attachment response missing 'data' key")
return None, ""
# Result is base64-encoded file content
raw_data = base64.b64decode(result)
ext = _guess_extension(raw_data)
if _is_image_ext(ext):
path = cache_image_from_bytes(raw_data, ext)
elif _is_audio_ext(ext):
path = cache_audio_from_bytes(raw_data, ext)
else:
path = cache_document_from_bytes(raw_data, ext)
return path, ext
# ------------------------------------------------------------------
# JSON-RPC Communication
# ------------------------------------------------------------------
async def _rpc(self, method: str, params: dict, rpc_id: str = None) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon."""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
return None
if rpc_id is None:
rpc_id = f"{method}_{int(time.time() * 1000)}"
payload = {
"jsonrpc": "2.0",
"method": method,
"params": params,
"id": rpc_id,
}
try:
resp = await self.client.post(
f"{self.http_url}/api/v1/rpc",
json=payload,
timeout=30.0,
)
resp.raise_for_status()
data = resp.json()
if "error" in data:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except Exception as e:
logger.warning("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
# Sending
# ------------------------------------------------------------------
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a text message."""
await self._stop_typing_indicator(chat_id)
params: Dict[str, Any] = {
"account": self.account,
"message": content,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
self._track_sent_timestamp(result)
return SendResult(success=True)
return SendResult(success=False, error="RPC send failed")
def _track_sent_timestamp(self, rpc_result) -> None:
"""Record outbound message timestamp for echo-back filtering."""
ts = rpc_result.get("timestamp") if isinstance(rpc_result, dict) else None
if ts:
self._recent_sent_timestamps.add(ts)
if len(self._recent_sent_timestamps) > self._max_recent_timestamps:
self._recent_sent_timestamps.pop()
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send a typing indicator."""
params: Dict[str, Any] = {
"account": self.account,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
await self._rpc("sendTyping", params, rpc_id="typing")
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send an image. Supports http(s):// and file:// URLs."""
await self._stop_typing_indicator(chat_id)
# Resolve image to local path
if image_url.startswith("file://"):
file_path = unquote(image_url[7:])
else:
# Download remote image to cache
try:
file_path = await cache_image_from_url(image_url)
except Exception as e:
logger.warning("Signal: failed to download image: %s", e)
return SendResult(success=False, error=str(e))
if not file_path or not Path(file_path).exists():
return SendResult(success=False, error="Image file not found")
# Validate size
file_size = Path(file_path).stat().st_size
if file_size > SIGNAL_MAX_ATTACHMENT_SIZE:
return SendResult(success=False, error=f"Image too large ({file_size} bytes)")
params: Dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
self._track_sent_timestamp(result)
return SendResult(success=True)
return SendResult(success=False, error="RPC send with attachment failed")
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
filename: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file attachment."""
await self._stop_typing_indicator(chat_id)
if not Path(file_path).exists():
return SendResult(success=False, error="File not found")
params: Dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
self._track_sent_timestamp(result)
return SendResult(success=True)
return SendResult(success=False, error="RPC send document failed")
# ------------------------------------------------------------------
# Typing Indicators
# ------------------------------------------------------------------
async def _start_typing_indicator(self, chat_id: str) -> None:
"""Start a typing indicator loop for a chat."""
if chat_id in self._typing_tasks:
return # Already running
async def _typing_loop():
try:
while True:
await self.send_typing(chat_id)
await asyncio.sleep(TYPING_INTERVAL)
except asyncio.CancelledError:
pass
self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
async def _stop_typing_indicator(self, chat_id: str) -> None:
"""Stop a typing indicator loop for a chat."""
task = self._typing_tasks.pop(chat_id, None)
if task:
task.cancel()
try:
await task
except asyncio.CancelledError:
pass
# ------------------------------------------------------------------
# Chat Info
# ------------------------------------------------------------------
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a chat/contact."""
if chat_id.startswith("group:"):
return {
"name": chat_id,
"type": "group",
"chat_id": chat_id,
}
# Try to resolve contact name
result = await self._rpc("getContact", {
"account": self.account,
"contactAddress": chat_id,
})
name = chat_id
if result and isinstance(result, dict):
name = result.get("name") or result.get("profileName") or chat_id
return {
"name": name,
"type": "dm",
"chat_id": chat_id,
}

View file

@ -0,0 +1,852 @@
"""
Slack platform adapter.
Uses slack-bolt (Python) with Socket Mode for:
- Receiving messages from channels and DMs
- Sending responses back
- Handling slash commands
- Thread support
"""
import asyncio
import logging
import os
import re
from typing import Dict, List, Optional, Any
try:
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
AsyncApp = Any
AsyncSocketModeHandler = Any
AsyncWebClient = Any
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
SUPPORTED_DOCUMENT_TYPES,
cache_document_from_bytes,
cache_image_from_url,
cache_audio_from_url,
)
logger = logging.getLogger(__name__)
def check_slack_requirements() -> bool:
"""Check if Slack dependencies are available."""
return SLACK_AVAILABLE
class SlackAdapter(BasePlatformAdapter):
"""
Slack bot adapter using Socket Mode.
Requires two tokens:
- SLACK_BOT_TOKEN (xoxb-...) for API calls
- SLACK_APP_TOKEN (xapp-...) for Socket Mode connection
Features:
- DMs and channel messages (mention-gated in channels)
- Thread support
- File/image/audio attachments
- Slash commands (/hermes)
- Typing indicators (not natively supported by Slack bots)
"""
MAX_MESSAGE_LENGTH = 39000 # Slack API allows 40,000 chars; leave margin
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SLACK)
self._app: Optional[AsyncApp] = None
self._handler: Optional[AsyncSocketModeHandler] = None
self._bot_user_id: Optional[str] = None
self._user_name_cache: Dict[str, str] = {} # user_id → display name
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
logger.error(
"[Slack] slack-bolt not installed. Run: pip install slack-bolt",
)
return False
bot_token = self.config.token
app_token = os.getenv("SLACK_APP_TOKEN")
if not bot_token:
logger.error("[Slack] SLACK_BOT_TOKEN not set")
return False
if not app_token:
logger.error("[Slack] SLACK_APP_TOKEN not set")
return False
try:
self._app = AsyncApp(token=bot_token)
# Get our own bot user ID for mention detection
auth_response = await self._app.client.auth_test()
self._bot_user_id = auth_response.get("user_id")
bot_name = auth_response.get("user", "unknown")
# Register message event handler
@self._app.event("message")
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
# Register slash command handler
@self._app.command("/hermes")
async def handle_hermes_command(ack, command):
await ack()
await self._handle_slash_command(command)
# Start Socket Mode handler in background
self._handler = AsyncSocketModeHandler(self._app, app_token)
asyncio.create_task(self._handler.start_async())
self._running = True
logger.info("[Slack] Connected as @%s (Socket Mode)", bot_name)
return True
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Connection failed: %s", e, exc_info=True)
return False
async def disconnect(self) -> None:
"""Disconnect from Slack."""
if self._handler:
try:
await self._handler.close_async()
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
self._running = False
logger.info("[Slack] Disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a message to a Slack channel or DM."""
if not self._app:
return SendResult(success=False, error="Not connected")
try:
# Convert standard markdown → Slack mrkdwn
formatted = self.format_message(content)
# Split long messages, preserving code block boundaries
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
thread_ts = self._resolve_thread_ts(reply_to, metadata)
last_result = None
# reply_broadcast: also post thread replies to the main channel.
# Controlled via platform config: gateway.slack.reply_broadcast
broadcast = self.config.extra.get("reply_broadcast", False)
for i, chunk in enumerate(chunks):
kwargs = {
"channel": chat_id,
"text": chunk,
}
if thread_ts:
kwargs["thread_ts"] = thread_ts
# Only broadcast the first chunk of the first reply
if broadcast and i == 0:
kwargs["reply_broadcast"] = True
last_result = await self._app.client.chat_postMessage(**kwargs)
return SendResult(
success=True,
message_id=last_result.get("ts") if last_result else None,
raw_response=last_result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Send error: %s", e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
message_id: str,
content: str,
) -> SendResult:
"""Edit a previously sent Slack message."""
if not self._app:
return SendResult(success=False, error="Not connected")
try:
await self._app.client.chat_update(
channel=chat_id,
ts=message_id,
text=content,
)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to edit message %s in channel %s: %s",
message_id,
chat_id,
e,
exc_info=True,
)
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Show a typing/status indicator using assistant.threads.setStatus.
Displays "is thinking..." next to the bot name in a thread.
Requires the assistant:write or chat:write scope.
Auto-clears when the bot sends a reply to the thread.
"""
if not self._app:
return
thread_ts = None
if metadata:
thread_ts = metadata.get("thread_id") or metadata.get("thread_ts")
if not thread_ts:
return # Can only set status in a thread context
try:
await self._app.client.assistant_threads_setStatus(
channel_id=chat_id,
thread_ts=thread_ts,
status="is thinking...",
)
except Exception as e:
# Silently ignore — may lack assistant:write scope or not be
# in an assistant-enabled context. Falls back to reactions.
logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)
def _resolve_thread_ts(
self,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
"""Resolve the correct thread_ts for a Slack API call.
Prefers metadata thread_id (the thread parent's ts, set by the
gateway) over reply_to (which may be a child message's ts).
"""
if metadata:
if metadata.get("thread_id"):
return metadata["thread_id"]
if metadata.get("thread_ts"):
return metadata["thread_ts"]
return reply_to
async def _upload_file(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload a local file to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=file_path,
filename=os.path.basename(file_path),
initial_comment=caption or "",
thread_ts=self._resolve_thread_ts(reply_to, metadata),
)
return SendResult(success=True, raw_response=result)
# ----- Markdown → mrkdwn conversion -----
def format_message(self, content: str) -> str:
"""Convert standard markdown to Slack mrkdwn format.
Protected regions (code blocks, inline code) are extracted first so
their contents are never modified. Standard markdown constructs
(headers, bold, italic, links) are translated to mrkdwn syntax.
"""
if not content:
return content
placeholders: dict = {}
counter = [0]
def _ph(value: str) -> str:
"""Stash value behind a placeholder that survives later passes."""
key = f"\x00SL{counter[0]}\x00"
counter[0] += 1
placeholders[key] = value
return key
text = content
# 1) Protect fenced code blocks (``` ... ```)
text = re.sub(
r'(```(?:[^\n]*\n)?[\s\S]*?```)',
lambda m: _ph(m.group(0)),
text,
)
# 2) Protect inline code (`...`)
text = re.sub(r'(`[^`]+`)', lambda m: _ph(m.group(0)), text)
# 3) Convert markdown links [text](url) → <url|text>
text = re.sub(
r'\[([^\]]+)\]\(([^)]+)\)',
lambda m: _ph(f'<{m.group(2)}|{m.group(1)}>'),
text,
)
# 4) Convert headers (## Title) → *Title* (bold)
def _convert_header(m):
inner = m.group(1).strip()
# Strip redundant bold markers inside a header
inner = re.sub(r'\*\*(.+?)\*\*', r'\1', inner)
return _ph(f'*{inner}*')
text = re.sub(
r'^#{1,6}\s+(.+)$', _convert_header, text, flags=re.MULTILINE
)
# 5) Convert bold: **text** → *text* (Slack bold)
text = re.sub(
r'\*\*(.+?)\*\*',
lambda m: _ph(f'*{m.group(1)}*'),
text,
)
# 6) Convert italic: _text_ stays as _text_ (already Slack italic)
# Single *text* → _text_ (Slack italic)
text = re.sub(
r'(?<!\*)\*([^*\n]+)\*(?!\*)',
lambda m: _ph(f'_{m.group(1)}_'),
text,
)
# 7) Convert strikethrough: ~~text~~ → ~text~
text = re.sub(
r'~~(.+?)~~',
lambda m: _ph(f'~{m.group(1)}~'),
text,
)
# 8) Convert blockquotes: > text → > text (same syntax, just ensure
# no extra escaping happens to the > character)
# Slack uses the same > prefix, so this is a no-op for content.
# 9) Restore placeholders in reverse order
for key in reversed(list(placeholders.keys())):
text = text.replace(key, placeholders[key])
return text
# ----- Reactions -----
async def _add_reaction(
self, channel: str, timestamp: str, emoji: str
) -> bool:
"""Add an emoji reaction to a message. Returns True on success."""
if not self._app:
return False
try:
await self._app.client.reactions_add(
channel=channel, timestamp=timestamp, name=emoji
)
return True
except Exception as e:
# Don't log as error — may fail if already reacted or missing scope
logger.debug("[Slack] reactions.add failed (%s): %s", emoji, e)
return False
async def _remove_reaction(
self, channel: str, timestamp: str, emoji: str
) -> bool:
"""Remove an emoji reaction from a message. Returns True on success."""
if not self._app:
return False
try:
await self._app.client.reactions_remove(
channel=channel, timestamp=timestamp, name=emoji
)
return True
except Exception as e:
logger.debug("[Slack] reactions.remove failed (%s): %s", emoji, e)
return False
# ----- User identity resolution -----
async def _resolve_user_name(self, user_id: str) -> str:
"""Resolve a Slack user ID to a display name, with caching."""
if not user_id:
return ""
if user_id in self._user_name_cache:
return self._user_name_cache[user_id]
if not self._app:
return user_id
try:
result = await self._app.client.users_info(user=user_id)
user = result.get("user", {})
# Prefer display_name → real_name → user_id
profile = user.get("profile", {})
name = (
profile.get("display_name")
or profile.get("real_name")
or user.get("real_name")
or user.get("name")
or user_id
)
self._user_name_cache[user_id] = name
return name
except Exception as e:
logger.debug("[Slack] users.info failed for %s: %s", user_id, e)
self._user_name_cache[user_id] = user_id
return user_id
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a local image file to Slack by uploading it."""
try:
return await self._upload_file(chat_id, image_path, caption, reply_to, metadata)
except FileNotFoundError:
return SendResult(success=False, error=f"Image file not found: {image_path}")
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send local Slack image %s: %s",
self.name,
image_path,
e,
exc_info=True,
)
text = f"🖼️ Image: {image_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id, text, reply_to=reply_to, metadata=metadata)
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an image to Slack by uploading the URL as a file."""
if not self._app:
return SendResult(success=False, error="Not connected")
try:
import httpx
# Download the image first
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(image_url)
response.raise_for_status()
result = await self._app.client.files_upload_v2(
channel=chat_id,
content=response.content,
filename="image.png",
initial_comment=caption or "",
thread_ts=self._resolve_thread_ts(reply_to, metadata),
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.warning(
"[Slack] Failed to upload image from URL %s, falling back to text: %s",
image_url,
e,
exc_info=True,
)
# Fall back to sending the URL as text
text = f"{caption}\n{image_url}" if caption else image_url
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
async def send_voice(
self,
chat_id: str,
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
"""Send an audio file to Slack."""
try:
return await self._upload_file(chat_id, audio_path, caption, reply_to, metadata)
except FileNotFoundError:
return SendResult(success=False, error=f"Audio file not found: {audio_path}")
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to send audio file %s: %s",
audio_path,
e,
exc_info=True,
)
return SendResult(success=False, error=str(e))
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a video file to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(video_path):
return SendResult(success=False, error=f"Video file not found: {video_path}")
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=video_path,
filename=os.path.basename(video_path),
initial_comment=caption or "",
thread_ts=self._resolve_thread_ts(reply_to, metadata),
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send video %s: %s",
self.name,
video_path,
e,
exc_info=True,
)
text = f"🎬 Video: {video_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id, text, reply_to=reply_to, metadata=metadata)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a document/file attachment to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
display_name = file_name or os.path.basename(file_path)
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=file_path,
filename=display_name,
initial_comment=caption or "",
thread_ts=self._resolve_thread_ts(reply_to, metadata),
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send document %s: %s",
self.name,
file_path,
e,
exc_info=True,
)
text = f"📎 File: {file_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id, text, reply_to=reply_to, metadata=metadata)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Slack channel."""
if not self._app:
return {"name": chat_id, "type": "unknown"}
try:
result = await self._app.client.conversations_info(channel=chat_id)
channel = result.get("channel", {})
is_dm = channel.get("is_im", False)
return {
"name": channel.get("name", chat_id),
"type": "dm" if is_dm else "group",
}
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to fetch chat info for %s: %s",
chat_id,
e,
exc_info=True,
)
return {"name": chat_id, "type": "unknown"}
# ----- Internal handlers -----
async def _handle_slack_message(self, event: dict) -> None:
"""Handle an incoming Slack message event."""
# Ignore bot messages (including our own)
if event.get("bot_id") or event.get("subtype") == "bot_message":
return
# Ignore message edits and deletions
subtype = event.get("subtype")
if subtype in ("message_changed", "message_deleted"):
return
text = event.get("text", "")
user_id = event.get("user", "")
channel_id = event.get("channel", "")
ts = event.get("ts", "")
# Determine if this is a DM or channel message
channel_type = event.get("channel_type", "")
is_dm = channel_type == "im"
# Build thread_ts for session keying.
# In channels: fall back to ts so each top-level @mention starts a
# new thread/session (the bot always replies in a thread).
# In DMs: only use the real thread_ts — top-level DMs should share
# one continuous session, threaded DMs get their own session.
if is_dm:
thread_ts = event.get("thread_ts") # None for top-level DMs
else:
thread_ts = event.get("thread_ts") or ts # ts fallback for channels
# In channels, only respond if bot is mentioned
if not is_dm and self._bot_user_id:
if f"<@{self._bot_user_id}>" not in text:
return
# Strip the bot mention from the text
text = text.replace(f"<@{self._bot_user_id}>", "").strip()
# Determine message type
msg_type = MessageType.TEXT
if text.startswith("/"):
msg_type = MessageType.COMMAND
# Handle file attachments
media_urls = []
media_types = []
files = event.get("files", [])
for f in files:
mimetype = f.get("mimetype", "unknown")
url = f.get("url_private_download") or f.get("url_private", "")
if mimetype.startswith("image/") and url:
try:
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
ext = ".jpg"
# Slack private URLs require the bot token as auth header
cached = await self._download_slack_file(url, ext)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.PHOTO
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache image from %s: %s", url, e, exc_info=True)
elif mimetype.startswith("audio/") and url:
try:
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
ext = ".ogg"
cached = await self._download_slack_file(url, ext, audio=True)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.VOICE
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache audio from %s: %s", url, e, exc_info=True)
elif url:
# Try to handle as a document attachment
try:
original_filename = f.get("name", "")
ext = ""
if original_filename:
_, ext = os.path.splitext(original_filename)
ext = ext.lower()
# Fallback: reverse-lookup from MIME type
if not ext and mimetype:
mime_to_ext = {v: k for k, v in SUPPORTED_DOCUMENT_TYPES.items()}
ext = mime_to_ext.get(mimetype, "")
if ext not in SUPPORTED_DOCUMENT_TYPES:
continue # Skip unsupported file types silently
# Check file size (Slack limit: 20 MB for bots)
file_size = f.get("size", 0)
MAX_DOC_BYTES = 20 * 1024 * 1024
if not file_size or file_size > MAX_DOC_BYTES:
logger.warning("[Slack] Document too large or unknown size: %s", file_size)
continue
# Download and cache
raw_bytes = await self._download_slack_file_bytes(url)
cached_path = cache_document_from_bytes(
raw_bytes, original_filename or f"document{ext}"
)
doc_mime = SUPPORTED_DOCUMENT_TYPES[ext]
media_urls.append(cached_path)
media_types.append(doc_mime)
msg_type = MessageType.DOCUMENT
logger.debug("[Slack] Cached user document: %s", cached_path)
# Inject text content for .txt/.md files (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if ext in (".md", ".txt") and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
try:
text_content = raw_bytes.decode("utf-8")
display_name = original_filename or f"document{ext}"
display_name = re.sub(r'[^\w.\- ]', '_', display_name)
injection = f"[Content of {display_name}]:\n{text_content}"
if text:
text = f"{injection}\n\n{text}"
else:
text = injection
except UnicodeDecodeError:
pass # Binary content, skip injection
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache document from %s: %s", url, e, exc_info=True)
# Resolve user display name (cached after first lookup)
user_name = await self._resolve_user_name(user_id)
# Build source
source = self.build_source(
chat_id=channel_id,
chat_name=channel_id, # Will be resolved later if needed
chat_type="dm" if is_dm else "group",
user_id=user_id,
user_name=user_name,
thread_id=thread_ts,
)
msg_event = MessageEvent(
text=text,
message_type=msg_type,
source=source,
raw_message=event,
message_id=ts,
media_urls=media_urls,
media_types=media_types,
reply_to_message_id=thread_ts if thread_ts != ts else None,
)
# Add 👀 reaction to acknowledge receipt
await self._add_reaction(channel_id, ts, "eyes")
await self.handle_message(msg_event)
# Replace 👀 with ✅ when done
await self._remove_reaction(channel_id, ts, "eyes")
await self._add_reaction(channel_id, ts, "white_check_mark")
async def _handle_slash_command(self, command: dict) -> None:
"""Handle /hermes slash command."""
text = command.get("text", "").strip()
user_id = command.get("user_id", "")
channel_id = command.get("channel_id", "")
# Map subcommands to gateway commands — derived from central registry.
# Also keep "compact" as a Slack-specific alias for /compress.
from hermes_cli.commands import slack_subcommand_map
subcommand_map = slack_subcommand_map()
subcommand_map["compact"] = "/compress"
first_word = text.split()[0] if text else ""
if first_word in subcommand_map:
# Preserve arguments after the subcommand
rest = text[len(first_word):].strip()
text = f"{subcommand_map[first_word]} {rest}".strip() if rest else subcommand_map[first_word]
elif text:
pass # Treat as a regular question
else:
text = "/help"
source = self.build_source(
chat_id=channel_id,
chat_type="dm", # Slash commands are always in DM-like context
user_id=user_id,
)
event = MessageEvent(
text=text,
message_type=MessageType.COMMAND if text.startswith("/") else MessageType.TEXT,
source=source,
raw_message=command,
)
await self.handle_message(event)
async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
"""Download a Slack file using the bot token for auth."""
import httpx
bot_token = self.config.token
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
async def _download_slack_file_bytes(self, url: str) -> bytes:
"""Download a Slack file and return raw bytes."""
import httpx
bot_token = self.config.token
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content

View file

@ -0,0 +1,271 @@
"""SMS (Twilio) platform adapter.
Connects to the Twilio REST API for outbound SMS and runs an aiohttp
webhook server to receive inbound messages.
Shares credentials with the optional telephony skill same env vars:
- TWILIO_ACCOUNT_SID
- TWILIO_AUTH_TOKEN
- TWILIO_PHONE_NUMBER (E.164 from-number, e.g. +15551234567)
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
- SMS_ALLOW_ALL_USERS (true/false)
- SMS_HOME_CHANNEL (phone number for cron delivery)
"""
import asyncio
import base64
import json
import logging
import os
import re
import urllib.parse
from typing import Any, Dict, List, Optional
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +1555***4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "***" + phone[-2:] if len(phone) > 4 else "****"
return phone[:5] + "***" + phone[-4:]
def check_sms_requirements() -> bool:
"""Check if SMS adapter dependencies are available."""
try:
import aiohttp # noqa: F401
except ImportError:
return False
return bool(os.getenv("TWILIO_ACCOUNT_SID") and os.getenv("TWILIO_AUTH_TOKEN"))
class SmsAdapter(BasePlatformAdapter):
"""
Twilio SMS <-> Hermes gateway adapter.
Each inbound phone number gets its own Hermes session (multi-tenant).
Replies are always sent from the configured TWILIO_PHONE_NUMBER.
"""
MAX_MESSAGE_LENGTH = MAX_SMS_LENGTH
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SMS)
self._account_sid: str = os.environ["TWILIO_ACCOUNT_SID"]
self._auth_token: str = os.environ["TWILIO_AUTH_TOKEN"]
self._from_number: str = os.getenv("TWILIO_PHONE_NUMBER", "")
self._webhook_port: int = int(
os.getenv("SMS_WEBHOOK_PORT", str(DEFAULT_WEBHOOK_PORT))
)
self._runner = None
self._http_session: Optional["aiohttp.ClientSession"] = None
def _basic_auth_header(self) -> str:
"""Build HTTP Basic auth header value for Twilio."""
creds = f"{self._account_sid}:{self._auth_token}"
encoded = base64.b64encode(creds.encode("ascii")).decode("ascii")
return f"Basic {encoded}"
# ------------------------------------------------------------------
# Required abstract methods
# ------------------------------------------------------------------
async def connect(self) -> bool:
import aiohttp
from aiohttp import web
if not self._from_number:
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
return False
app = web.Application()
app.router.add_post("/webhooks/twilio", self._handle_webhook)
app.router.add_get("/health", lambda _: web.Response(text="ok"))
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, "0.0.0.0", self._webhook_port)
await site.start()
self._http_session = aiohttp.ClientSession()
self._running = True
logger.info(
"[sms] Twilio webhook server listening on port %d, from: %s",
self._webhook_port,
_redact_phone(self._from_number),
)
return True
async def disconnect(self) -> None:
if self._http_session:
await self._http_session.close()
self._http_session = None
if self._runner:
await self._runner.cleanup()
self._runner = None
self._running = False
logger.info("[sms] Disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
import aiohttp
formatted = self.format_message(content)
chunks = self.truncate_message(formatted)
last_result = SendResult(success=True)
url = f"{TWILIO_API_BASE}/{self._account_sid}/Messages.json"
headers = {
"Authorization": self._basic_auth_header(),
}
session = self._http_session or aiohttp.ClientSession()
try:
for chunk in chunks:
form_data = aiohttp.FormData()
form_data.add_field("From", self._from_number)
form_data.add_field("To", chat_id)
form_data.add_field("Body", chunk)
try:
async with session.post(url, data=form_data, headers=headers) as resp:
body = await resp.json()
if resp.status >= 400:
error_msg = body.get("message", str(body))
logger.error(
"[sms] send failed to %s: %s %s",
_redact_phone(chat_id),
resp.status,
error_msg,
)
return SendResult(
success=False,
error=f"Twilio {resp.status}: {error_msg}",
)
msg_sid = body.get("sid", "")
last_result = SendResult(success=True, message_id=msg_sid)
except Exception as e:
logger.error("[sms] send error to %s: %s", _redact_phone(chat_id), e)
return SendResult(success=False, error=str(e))
finally:
# Close session only if we created a fallback (no persistent session)
if not self._http_session and session:
await session.close()
return last_result
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
return {"name": chat_id, "type": "dm"}
# ------------------------------------------------------------------
# SMS-specific formatting
# ------------------------------------------------------------------
def format_message(self, content: str) -> str:
"""Strip markdown — SMS renders it as literal characters."""
content = re.sub(r"\*\*(.+?)\*\*", r"\1", content, flags=re.DOTALL)
content = re.sub(r"\*(.+?)\*", r"\1", content, flags=re.DOTALL)
content = re.sub(r"__(.+?)__", r"\1", content, flags=re.DOTALL)
content = re.sub(r"_(.+?)_", r"\1", content, flags=re.DOTALL)
content = re.sub(r"```[a-z]*\n?", "", content)
content = re.sub(r"`(.+?)`", r"\1", content)
content = re.sub(r"^#{1,6}\s+", "", content, flags=re.MULTILINE)
content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)
content = re.sub(r"\n{3,}", "\n\n", content)
return content.strip()
# ------------------------------------------------------------------
# Twilio webhook handler
# ------------------------------------------------------------------
async def _handle_webhook(self, request) -> "aiohttp.web.Response":
from aiohttp import web
try:
raw = await request.read()
# Twilio sends form-encoded data, not JSON
form = urllib.parse.parse_qs(raw.decode("utf-8"))
except Exception as e:
logger.error("[sms] webhook parse error: %s", e)
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
status=400,
)
# Extract fields (parse_qs returns lists)
from_number = (form.get("From", [""]))[0].strip()
to_number = (form.get("To", [""]))[0].strip()
text = (form.get("Body", [""]))[0].strip()
message_sid = (form.get("MessageSid", [""]))[0].strip()
if not from_number or not text:
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
)
# Ignore messages from our own number (echo prevention)
if from_number == self._from_number:
logger.debug("[sms] ignoring echo from own number %s", _redact_phone(from_number))
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
)
logger.info(
"[sms] inbound from %s -> %s: %s",
_redact_phone(from_number),
_redact_phone(to_number),
text[:80],
)
source = self.build_source(
chat_id=from_number,
chat_name=from_number,
chat_type="dm",
user_id=from_number,
user_name=from_number,
)
event = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
raw_message=form,
message_id=message_sid,
)
# Non-blocking: Twilio expects a fast response
asyncio.create_task(self.handle_message(event))
# Return empty TwiML — we send replies via the REST API, not inline TwiML
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
)

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,557 @@
"""Generic webhook platform adapter.
Runs an aiohttp HTTP server that receives webhook POSTs from external
services (GitHub, GitLab, JIRA, Stripe, etc.), validates HMAC signatures,
transforms payloads into agent prompts, and routes responses back to the
source or to another configured platform.
Configuration lives in config.yaml under platforms.webhook.extra.routes.
Each route defines:
- events: which event types to accept (header-based filtering)
- secret: HMAC secret for signature validation (REQUIRED)
- prompt: template string formatted with the webhook payload
- skills: optional list of skills to load for the agent
- deliver: where to send the response (github_comment, telegram, etc.)
- deliver_extra: additional delivery config (repo, pr_number, chat_id)
Security:
- HMAC secret is required per route (validated at startup)
- Rate limiting per route (fixed-window, configurable)
- Idempotency cache prevents duplicate agent runs on webhook retries
- Body size limits checked before reading payload
- Set secret to "INSECURE_NO_AUTH" to skip validation (testing only)
"""
import asyncio
import hashlib
import hmac
import json
import logging
import re
import subprocess
import time
from typing import Any, Dict, List, Optional
try:
from aiohttp import web
AIOHTTP_AVAILABLE = True
except ImportError:
AIOHTTP_AVAILABLE = False
web = None # type: ignore[assignment]
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
)
logger = logging.getLogger(__name__)
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 8644
_INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
def check_webhook_requirements() -> bool:
"""Check if webhook adapter dependencies are available."""
return AIOHTTP_AVAILABLE
class WebhookAdapter(BasePlatformAdapter):
"""Generic webhook receiver that triggers agent runs from HTTP POSTs."""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WEBHOOK)
self._host: str = config.extra.get("host", DEFAULT_HOST)
self._port: int = int(config.extra.get("port", DEFAULT_PORT))
self._global_secret: str = config.extra.get("secret", "")
self._routes: Dict[str, dict] = config.extra.get("routes", {})
self._runner = None
# Delivery info keyed by session chat_id — consumed by send()
self._delivery_info: Dict[str, dict] = {}
# Reference to gateway runner for cross-platform delivery (set externally)
self.gateway_runner = None
# Idempotency: TTL cache of recently processed delivery IDs.
# Prevents duplicate agent runs when webhook providers retry.
self._seen_deliveries: Dict[str, float] = {}
self._idempotency_ttl: int = 3600 # 1 hour
# Rate limiting: per-route timestamps in a fixed window.
self._rate_counts: Dict[str, List[float]] = {}
self._rate_limit: int = int(config.extra.get("rate_limit", 30)) # per minute
# Body size limit (auth-before-body pattern)
self._max_body_bytes: int = int(
config.extra.get("max_body_bytes", 1_048_576)
) # 1MB
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
# Validate routes at startup — secret is required per route
for name, route in self._routes.items():
secret = route.get("secret", self._global_secret)
if not secret:
raise ValueError(
f"[webhook] Route '{name}' has no HMAC secret. "
f"Set 'secret' on the route or globally. "
f"For testing without auth, set secret to '{_INSECURE_NO_AUTH}'."
)
app = web.Application()
app.router.add_get("/health", self._handle_health)
app.router.add_post("/webhooks/{route_name}", self._handle_webhook)
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, self._host, self._port)
await site.start()
self._mark_connected()
route_names = ", ".join(self._routes.keys()) or "(none configured)"
logger.info(
"[webhook] Listening on %s:%d — routes: %s",
self._host,
self._port,
route_names,
)
return True
async def disconnect(self) -> None:
if self._runner:
await self._runner.cleanup()
self._runner = None
self._mark_disconnected()
logger.info("[webhook] Disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Deliver the agent's response to the configured destination.
chat_id is ``webhook:{route}:{delivery_id}`` we pop the delivery
info stored during webhook receipt so it doesn't leak memory.
"""
delivery = self._delivery_info.pop(chat_id, {})
deliver_type = delivery.get("deliver", "log")
if deliver_type == "log":
logger.info("[webhook] Response for %s: %s", chat_id, content[:200])
return SendResult(success=True)
if deliver_type == "github_comment":
return await self._deliver_github_comment(content, delivery)
# Cross-platform delivery (telegram, discord, etc.)
if self.gateway_runner and deliver_type in (
"telegram",
"discord",
"slack",
"signal",
"sms",
):
return await self._deliver_cross_platform(
deliver_type, content, delivery
)
logger.warning("[webhook] Unknown deliver type: %s", deliver_type)
return SendResult(
success=False, error=f"Unknown deliver type: {deliver_type}"
)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
return {"name": chat_id, "type": "webhook"}
# ------------------------------------------------------------------
# HTTP handlers
# ------------------------------------------------------------------
async def _handle_health(self, request: "web.Request") -> "web.Response":
"""GET /health — simple health check."""
return web.json_response({"status": "ok", "platform": "webhook"})
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
route_name = request.match_info.get("route_name", "")
route_config = self._routes.get(route_name)
if not route_config:
return web.json_response(
{"error": f"Unknown route: {route_name}"}, status=404
)
# ── Auth-before-body ─────────────────────────────────────
# Check Content-Length before reading the full payload.
content_length = request.content_length or 0
if content_length > self._max_body_bytes:
return web.json_response(
{"error": "Payload too large"}, status=413
)
# ── Rate limiting ────────────────────────────────────────
now = time.time()
window = self._rate_counts.setdefault(route_name, [])
window[:] = [t for t in window if now - t < 60]
if len(window) >= self._rate_limit:
return web.json_response(
{"error": "Rate limit exceeded"}, status=429
)
window.append(now)
# Read body
try:
raw_body = await request.read()
except Exception as e:
logger.error("[webhook] Failed to read body: %s", e)
return web.json_response({"error": "Bad request"}, status=400)
# Validate HMAC signature (skip for INSECURE_NO_AUTH testing mode)
secret = route_config.get("secret", self._global_secret)
if secret and secret != _INSECURE_NO_AUTH:
if not self._validate_signature(request, raw_body, secret):
logger.warning(
"[webhook] Invalid signature for route %s", route_name
)
return web.json_response(
{"error": "Invalid signature"}, status=401
)
# Parse payload
try:
payload = json.loads(raw_body)
except json.JSONDecodeError:
# Try form-encoded as fallback
try:
import urllib.parse
payload = dict(
urllib.parse.parse_qsl(raw_body.decode("utf-8"))
)
except Exception:
return web.json_response(
{"error": "Cannot parse body"}, status=400
)
# Check event type filter
event_type = (
request.headers.get("X-GitHub-Event", "")
or request.headers.get("X-GitLab-Event", "")
or payload.get("event_type", "")
or "unknown"
)
allowed_events = route_config.get("events", [])
if allowed_events and event_type not in allowed_events:
logger.debug(
"[webhook] Ignoring event %s for route %s (allowed: %s)",
event_type,
route_name,
allowed_events,
)
return web.json_response(
{"status": "ignored", "event": event_type}
)
# Format prompt from template
prompt_template = route_config.get("prompt", "")
prompt = self._render_prompt(
prompt_template, payload, event_type, route_name
)
# Inject skill content if configured.
# We call build_skill_invocation_message() directly rather than
# using /skill-name slash commands — the gateway's command parser
# would intercept those and break the flow.
skills = route_config.get("skills", [])
if skills:
try:
from agent.skill_commands import (
build_skill_invocation_message,
get_skill_commands,
)
skill_cmds = get_skill_commands()
for skill_name in skills:
cmd_key = f"/{skill_name}"
if cmd_key in skill_cmds:
skill_content = build_skill_invocation_message(
cmd_key, user_instruction=prompt
)
if skill_content:
prompt = skill_content
break # Load the first matching skill
else:
logger.warning(
"[webhook] Skill '%s' not found", skill_name
)
except Exception as e:
logger.warning("[webhook] Skill loading failed: %s", e)
# Build a unique delivery ID
delivery_id = request.headers.get(
"X-GitHub-Delivery",
request.headers.get("X-Request-ID", str(int(time.time() * 1000))),
)
# ── Idempotency ─────────────────────────────────────────
# Skip duplicate deliveries (webhook retries).
now = time.time()
# Prune expired entries
self._seen_deliveries = {
k: v
for k, v in self._seen_deliveries.items()
if now - v < self._idempotency_ttl
}
if delivery_id in self._seen_deliveries:
logger.info(
"[webhook] Skipping duplicate delivery %s", delivery_id
)
return web.json_response(
{"status": "duplicate", "delivery_id": delivery_id},
status=200,
)
self._seen_deliveries[delivery_id] = now
# Use delivery_id in session key so concurrent webhooks on the
# same route get independent agent runs (not queued/interrupted).
session_chat_id = f"webhook:{route_name}:{delivery_id}"
# Store delivery info for send() — consumed (popped) on delivery
deliver_config = {
"deliver": route_config.get("deliver", "log"),
"deliver_extra": self._render_delivery_extra(
route_config.get("deliver_extra", {}), payload
),
"payload": payload,
}
self._delivery_info[session_chat_id] = deliver_config
# Build source and event
source = self.build_source(
chat_id=session_chat_id,
chat_name=f"webhook/{route_name}",
chat_type="webhook",
user_id=f"webhook:{route_name}",
user_name=route_name,
)
event = MessageEvent(
text=prompt,
message_type=MessageType.TEXT,
source=source,
raw_message=payload,
message_id=delivery_id,
)
logger.info(
"[webhook] %s event=%s route=%s prompt_len=%d delivery=%s",
request.method,
event_type,
route_name,
len(prompt),
delivery_id,
)
# Non-blocking — return 202 Accepted immediately
asyncio.create_task(self.handle_message(event))
return web.json_response(
{
"status": "accepted",
"route": route_name,
"event": event_type,
"delivery_id": delivery_id,
},
status=202,
)
# ------------------------------------------------------------------
# Signature validation
# ------------------------------------------------------------------
def _validate_signature(
self, request: "web.Request", body: bytes, secret: str
) -> bool:
"""Validate webhook signature (GitHub, GitLab, generic HMAC-SHA256)."""
# GitHub: X-Hub-Signature-256 = sha256=<hex>
gh_sig = request.headers.get("X-Hub-Signature-256", "")
if gh_sig:
expected = "sha256=" + hmac.new(
secret.encode(), body, hashlib.sha256
).hexdigest()
return hmac.compare_digest(gh_sig, expected)
# GitLab: X-Gitlab-Token = <plain secret>
gl_token = request.headers.get("X-Gitlab-Token", "")
if gl_token:
return hmac.compare_digest(gl_token, secret)
# Generic: X-Webhook-Signature = <hex HMAC-SHA256>
generic_sig = request.headers.get("X-Webhook-Signature", "")
if generic_sig:
expected = hmac.new(
secret.encode(), body, hashlib.sha256
).hexdigest()
return hmac.compare_digest(generic_sig, expected)
# No recognised signature header but secret is configured → reject
logger.debug(
"[webhook] Secret configured but no signature header found"
)
return False
# ------------------------------------------------------------------
# Prompt rendering
# ------------------------------------------------------------------
def _render_prompt(
self,
template: str,
payload: dict,
event_type: str,
route_name: str,
) -> str:
"""Render a prompt template with the webhook payload.
Supports dot-notation access into nested dicts:
``{pull_request.title}`` ``payload["pull_request"]["title"]``
"""
if not template:
truncated = json.dumps(payload, indent=2)[:4000]
return (
f"Webhook event '{event_type}' on route "
f"'{route_name}':\n\n```json\n{truncated}\n```"
)
def _resolve(match: re.Match) -> str:
key = match.group(1)
value: Any = payload
for part in key.split("."):
if isinstance(value, dict):
value = value.get(part, f"{{{key}}}")
else:
return f"{{{key}}}"
if isinstance(value, (dict, list)):
return json.dumps(value, indent=2)[:2000]
return str(value)
return re.sub(r"\{([a-zA-Z0-9_.]+)\}", _resolve, template)
def _render_delivery_extra(
self, extra: dict, payload: dict
) -> dict:
"""Render delivery_extra template values with payload data."""
rendered: Dict[str, Any] = {}
for key, value in extra.items():
if isinstance(value, str):
rendered[key] = self._render_prompt(value, payload, "", "")
else:
rendered[key] = value
return rendered
# ------------------------------------------------------------------
# Response delivery
# ------------------------------------------------------------------
async def _deliver_github_comment(
self, content: str, delivery: dict
) -> SendResult:
"""Post agent response as a GitHub PR/issue comment via ``gh`` CLI."""
extra = delivery.get("deliver_extra", {})
repo = extra.get("repo", "")
pr_number = extra.get("pr_number", "")
if not repo or not pr_number:
logger.error(
"[webhook] github_comment delivery missing repo or pr_number"
)
return SendResult(
success=False, error="Missing repo or pr_number"
)
try:
result = subprocess.run(
[
"gh",
"pr",
"comment",
str(pr_number),
"--repo",
repo,
"--body",
content,
],
capture_output=True,
text=True,
timeout=30,
)
if result.returncode == 0:
logger.info(
"[webhook] Posted comment on %s#%s", repo, pr_number
)
return SendResult(success=True)
else:
logger.error(
"[webhook] gh pr comment failed: %s", result.stderr
)
return SendResult(success=False, error=result.stderr)
except FileNotFoundError:
logger.error(
"[webhook] 'gh' CLI not found — install GitHub CLI for "
"github_comment delivery"
)
return SendResult(
success=False, error="gh CLI not installed"
)
except Exception as e:
logger.error("[webhook] github_comment delivery error: %s", e)
return SendResult(success=False, error=str(e))
async def _deliver_cross_platform(
self, platform_name: str, content: str, delivery: dict
) -> SendResult:
"""Route response to another platform (telegram, discord, etc.)."""
if not self.gateway_runner:
return SendResult(
success=False,
error="No gateway runner for cross-platform delivery",
)
try:
target_platform = Platform(platform_name)
except ValueError:
return SendResult(
success=False, error=f"Unknown platform: {platform_name}"
)
adapter = self.gateway_runner.adapters.get(target_platform)
if not adapter:
return SendResult(
success=False,
error=f"Platform {platform_name} not connected",
)
# Use home channel if no specific chat_id in deliver_extra
extra = delivery.get("deliver_extra", {})
chat_id = extra.get("chat_id", "")
if not chat_id:
home = self.gateway_runner.config.get_home_channel(target_platform)
if home:
chat_id = home.chat_id
else:
return SendResult(
success=False,
error=f"No chat_id or home channel for {platform_name}",
)
return await adapter.send(chat_id, content)

View file

@ -0,0 +1,714 @@
"""
WhatsApp platform adapter.
WhatsApp integration is more complex than Telegram/Discord because:
- No official bot API for personal accounts
- Business API requires Meta Business verification
- Most solutions use web-based automation
This adapter supports multiple backends:
1. WhatsApp Business API (requires Meta verification)
2. whatsapp-web.js (via Node.js subprocess) - for personal accounts
3. Baileys (via Node.js subprocess) - alternative for personal accounts
For simplicity, we'll implement a generic interface that can work
with different backends via a bridge pattern.
"""
import asyncio
import json
import logging
import os
import platform
import subprocess
_IS_WINDOWS = platform.system() == "Windows"
from pathlib import Path
from typing import Dict, List, Optional, Any
from hermes_cli.config import get_hermes_home
logger = logging.getLogger(__name__)
def _kill_port_process(port: int) -> None:
"""Kill any process listening on the given TCP port."""
try:
if _IS_WINDOWS:
# Use netstat to find the PID bound to this port, then taskkill
result = subprocess.run(
["netstat", "-ano", "-p", "TCP"],
capture_output=True, text=True, timeout=5,
)
for line in result.stdout.splitlines():
parts = line.split()
if len(parts) >= 5 and parts[3] == "LISTENING":
local_addr = parts[1]
if local_addr.endswith(f":{port}"):
try:
subprocess.run(
["taskkill", "/PID", parts[4], "/F"],
capture_output=True, timeout=5,
)
except subprocess.SubprocessError:
pass
else:
result = subprocess.run(
["fuser", f"{port}/tcp"],
capture_output=True, timeout=5,
)
if result.returncode == 0:
subprocess.run(
["fuser", "-k", f"{port}/tcp"],
capture_output=True, timeout=5,
)
except Exception:
pass
import sys
sys.path.insert(0, str(Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_image_from_url,
cache_audio_from_url,
)
def check_whatsapp_requirements() -> bool:
"""
Check if WhatsApp dependencies are available.
WhatsApp requires a Node.js bridge for most implementations.
"""
# Check for Node.js
try:
result = subprocess.run(
["node", "--version"],
capture_output=True,
text=True,
timeout=5
)
return result.returncode == 0
except Exception:
return False
class WhatsAppAdapter(BasePlatformAdapter):
"""
WhatsApp adapter.
This implementation uses a simple HTTP bridge pattern where:
1. A Node.js process runs the WhatsApp Web client
2. Messages are forwarded via HTTP/IPC to this Python adapter
3. Responses are sent back through the bridge
The actual Node.js bridge implementation can vary:
- whatsapp-web.js based
- Baileys based
- Business API based
Configuration:
- bridge_script: Path to the Node.js bridge script
- bridge_port: Port for HTTP communication (default: 3000)
- session_path: Path to store WhatsApp session data
"""
# WhatsApp message limits
MAX_MESSAGE_LENGTH = 65536 # WhatsApp allows longer messages
# Default bridge location relative to the hermes-agent install
_DEFAULT_BRIDGE_DIR = Path(__file__).resolve().parents[2] / "scripts" / "whatsapp-bridge"
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WHATSAPP)
self._bridge_process: Optional[subprocess.Popen] = None
self._bridge_port: int = config.extra.get("bridge_port", 3000)
self._bridge_script: Optional[str] = config.extra.get(
"bridge_script",
str(self._DEFAULT_BRIDGE_DIR / "bridge.js"),
)
self._session_path: Path = Path(config.extra.get(
"session_path",
get_hermes_home() / "whatsapp" / "session"
))
self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
self._message_queue: asyncio.Queue = asyncio.Queue()
self._bridge_log_fh = None
self._bridge_log: Optional[Path] = None
async def connect(self) -> bool:
"""
Start the WhatsApp bridge.
This launches the Node.js bridge process and waits for it to be ready.
"""
if not check_whatsapp_requirements():
logger.warning("[%s] Node.js not found. WhatsApp requires Node.js.", self.name)
return False
bridge_path = Path(self._bridge_script)
if not bridge_path.exists():
logger.warning("[%s] Bridge script not found: %s", self.name, bridge_path)
return False
logger.info("[%s] Bridge found at %s", self.name, bridge_path)
# Auto-install npm dependencies if node_modules doesn't exist
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
print(f"[{self.name}] Installing WhatsApp bridge dependencies...")
try:
install_result = subprocess.run(
["npm", "install", "--silent"],
cwd=str(bridge_dir),
capture_output=True,
text=True,
timeout=60,
)
if install_result.returncode != 0:
print(f"[{self.name}] npm install failed: {install_result.stderr}")
return False
print(f"[{self.name}] Dependencies installed")
except Exception as e:
print(f"[{self.name}] Failed to install dependencies: {e}")
return False
try:
# Ensure session directory exists
self._session_path.mkdir(parents=True, exist_ok=True)
# Check if bridge is already running and connected
import aiohttp
import asyncio
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/health",
timeout=aiohttp.ClientTimeout(total=2)
) as resp:
if resp.status == 200:
data = await resp.json()
bridge_status = data.get("status", "unknown")
if bridge_status == "connected":
print(f"[{self.name}] Using existing bridge (status: {bridge_status})")
self._mark_connected()
self._bridge_process = None # Not managed by us
asyncio.create_task(self._poll_messages())
return True
else:
print(f"[{self.name}] Bridge found but not connected (status: {bridge_status}), restarting")
except Exception:
pass # Bridge not running, start a new one
# Kill any orphaned bridge from a previous gateway run
_kill_port_process(self._bridge_port)
await asyncio.sleep(1)
# Start the bridge process in its own process group.
# Route output to a log file so QR codes, errors, and reconnection
# messages are preserved for troubleshooting.
whatsapp_mode = os.getenv("WHATSAPP_MODE", "self-chat")
self._bridge_log = self._session_path.parent / "bridge.log"
bridge_log_fh = open(self._bridge_log, "a")
self._bridge_log_fh = bridge_log_fh
# Build bridge subprocess environment.
# Pass WHATSAPP_REPLY_PREFIX from config.yaml so the Node bridge
# can use it without the user needing to set a separate env var.
bridge_env = os.environ.copy()
if self._reply_prefix is not None:
bridge_env["WHATSAPP_REPLY_PREFIX"] = self._reply_prefix
self._bridge_process = subprocess.Popen(
[
"node",
str(bridge_path),
"--port", str(self._bridge_port),
"--session", str(self._session_path),
"--mode", whatsapp_mode,
],
stdout=bridge_log_fh,
stderr=bridge_log_fh,
preexec_fn=None if _IS_WINDOWS else os.setsid,
env=bridge_env,
)
# Wait for the bridge to connect to WhatsApp.
# Phase 1: wait for the HTTP server to come up (up to 15s).
# Phase 2: wait for WhatsApp status: connected (up to 15s more).
import aiohttp
http_ready = False
data = {}
for attempt in range(15):
await asyncio.sleep(1)
if self._bridge_process.poll() is not None:
print(f"[{self.name}] Bridge process died (exit code {self._bridge_process.returncode})")
print(f"[{self.name}] Check log: {self._bridge_log}")
self._close_bridge_log()
return False
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/health",
timeout=aiohttp.ClientTimeout(total=2)
) as resp:
if resp.status == 200:
http_ready = True
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
except Exception:
continue
if not http_ready:
print(f"[{self.name}] Bridge HTTP server did not start in 15s")
print(f"[{self.name}] Check log: {self._bridge_log}")
self._close_bridge_log()
return False
# Phase 2: HTTP is up but WhatsApp may still be connecting.
# Give it more time to authenticate with saved credentials.
if data.get("status") != "connected":
print(f"[{self.name}] Bridge HTTP ready, waiting for WhatsApp connection...")
for attempt in range(15):
await asyncio.sleep(1)
if self._bridge_process.poll() is not None:
print(f"[{self.name}] Bridge process died during connection")
print(f"[{self.name}] Check log: {self._bridge_log}")
self._close_bridge_log()
return False
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/health",
timeout=aiohttp.ClientTimeout(total=2)
) as resp:
if resp.status == 200:
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
except Exception:
continue
else:
# Still not connected — warn but proceed (bridge may
# auto-reconnect later, e.g. after a code 515 restart).
print(f"[{self.name}] ⚠ WhatsApp not connected after 30s")
print(f"[{self.name}] Bridge log: {self._bridge_log}")
print(f"[{self.name}] If session expired, re-pair: hermes whatsapp")
# Start message polling task
asyncio.create_task(self._poll_messages())
self._mark_connected()
print(f"[{self.name}] Bridge started on port {self._bridge_port}")
return True
except Exception as e:
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
def _close_bridge_log(self) -> None:
"""Close the bridge log file handle if open."""
if self._bridge_log_fh:
try:
self._bridge_log_fh.close()
except Exception:
pass
self._bridge_log_fh = None
async def _check_managed_bridge_exit(self) -> Optional[str]:
"""Return a fatal error message if the managed bridge child exited."""
if self._bridge_process is None:
return None
returncode = self._bridge_process.poll()
if returncode is None:
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("whatsapp_bridge_exited", message, retryable=True)
self._close_bridge_log()
await self._notify_fatal_error()
return self.fatal_error_message or message
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
if self._bridge_process:
try:
# Kill the entire process group so child node processes die too
import signal
try:
if _IS_WINDOWS:
self._bridge_process.terminate()
else:
os.killpg(os.getpgid(self._bridge_process.pid), signal.SIGTERM)
except (ProcessLookupError, PermissionError):
self._bridge_process.terminate()
await asyncio.sleep(1)
if self._bridge_process.poll() is None:
try:
if _IS_WINDOWS:
self._bridge_process.kill()
else:
os.killpg(os.getpgid(self._bridge_process.pid), signal.SIGKILL)
except (ProcessLookupError, PermissionError):
self._bridge_process.kill()
except Exception as e:
print(f"[{self.name}] Error stopping bridge: {e}")
else:
# Bridge was not started by us, don't kill it
print(f"[{self.name}] Disconnecting (external bridge left running)")
self._mark_disconnected()
self._bridge_process = None
self._close_bridge_log()
print(f"[{self.name}] Disconnected")
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
if not self._running:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
async with aiohttp.ClientSession() as session:
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except ImportError:
return SendResult(
success=False,
error="aiohttp not installed. Run: pip install aiohttp"
)
except Exception as e:
return SendResult(success=False, error=str(e))
async def edit_message(
self,
chat_id: str,
message_id: str,
content: str,
) -> SendResult:
"""Edit a previously sent message via the WhatsApp bridge."""
if not self._running:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
async def _send_media_to_bridge(
self,
chat_id: str,
file_path: str,
media_type: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
) -> SendResult:
"""Send any media file via bridge /send-media endpoint."""
if not self._running:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
payload: Dict[str, Any] = {
"chatId": chat_id,
"filePath": file_path,
"mediaType": media_type,
}
if caption:
payload["caption"] = caption
if file_name:
payload["fileName"] = file_name
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Download image URL to cache, send natively via bridge."""
try:
local_path = await cache_image_from_url(image_url)
return await self._send_media_to_bridge(chat_id, local_path, "image", caption)
except Exception:
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a local image file natively via bridge."""
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a video natively via bridge — plays inline in WhatsApp."""
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a document/file as a downloadable attachment via bridge."""
return await self._send_media_to_bridge(
chat_id, file_path, "document", caption,
file_name or os.path.basename(file_path),
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator via bridge."""
if not self._running:
return
if await self._check_managed_bridge_exit():
return
try:
import aiohttp
async with aiohttp.ClientSession() as session:
await session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a WhatsApp chat."""
if not self._running:
return {"name": "Unknown", "type": "dm"}
if await self._check_managed_bridge_exit():
return {"name": chat_id, "type": "dm"}
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
except Exception as e:
logger.debug("Could not get WhatsApp chat info for %s: %s", chat_id, e)
return {"name": chat_id, "type": "dm"}
async def _poll_messages(self) -> None:
"""Poll the bridge for incoming messages."""
try:
import aiohttp
except ImportError:
print(f"[{self.name}] aiohttp not installed, message polling disabled")
return
while self._running:
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
print(f"[{self.name}] {bridge_exit}")
break
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
except asyncio.CancelledError:
break
except Exception as e:
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
print(f"[{self.name}] {bridge_exit}")
break
print(f"[{self.name}] Poll error: {e}")
await asyncio.sleep(5)
await asyncio.sleep(1) # Poll interval
async def _build_message_event(self, data: Dict[str, Any]) -> Optional[MessageEvent]:
"""Build a MessageEvent from bridge message data, downloading images to cache."""
try:
# Determine message type
msg_type = MessageType.TEXT
if data.get("hasMedia"):
media_type = data.get("mediaType", "")
if "image" in media_type:
msg_type = MessageType.PHOTO
elif "video" in media_type:
msg_type = MessageType.VIDEO
elif "audio" in media_type or "ptt" in media_type: # ptt = voice note
msg_type = MessageType.VOICE
else:
msg_type = MessageType.DOCUMENT
# Determine chat type
is_group = data.get("isGroup", False)
chat_type = "group" if is_group else "dm"
# Build source
source = self.build_source(
chat_id=data.get("chatId", ""),
chat_name=data.get("chatName"),
chat_type=chat_type,
user_id=data.get("senderId"),
user_name=data.get("senderName"),
)
# Download image media URLs to the local cache so the vision tool
# can access them reliably regardless of URL expiration.
raw_urls = data.get("mediaUrls", [])
cached_urls = []
media_types = []
for url in raw_urls:
if msg_type == MessageType.PHOTO and url.startswith(("http://", "https://")):
try:
cached_path = await cache_image_from_url(url, ext=".jpg")
cached_urls.append(cached_path)
media_types.append("image/jpeg")
print(f"[{self.name}] Cached user image: {cached_path}", flush=True)
except Exception as e:
print(f"[{self.name}] Failed to cache image: {e}", flush=True)
cached_urls.append(url)
media_types.append("image/jpeg")
elif msg_type == MessageType.PHOTO and os.path.isabs(url):
# Local file path — bridge already downloaded the image
cached_urls.append(url)
media_types.append("image/jpeg")
print(f"[{self.name}] Using bridge-cached image: {url}", flush=True)
elif msg_type == MessageType.VOICE and url.startswith(("http://", "https://")):
try:
cached_path = await cache_audio_from_url(url, ext=".ogg")
cached_urls.append(cached_path)
media_types.append("audio/ogg")
print(f"[{self.name}] Cached user voice: {cached_path}", flush=True)
except Exception as e:
print(f"[{self.name}] Failed to cache voice: {e}", flush=True)
cached_urls.append(url)
media_types.append("audio/ogg")
else:
cached_urls.append(url)
media_types.append("unknown")
return MessageEvent(
text=data.get("body", ""),
message_type=msg_type,
source=source,
raw_message=data,
message_id=data.get("messageId"),
media_urls=cached_urls,
media_types=media_types,
)
except Exception as e:
print(f"[{self.name}] Error building event: {e}")
return None