feat(matrix): land QA follow-ups and refresh docs

- harden Matrix onboarding/chat lifecycle after manual QA
- refresh README and Matrix docs to match current behavior
- add local ignores for runtime artifacts and include current planning/report docs

Closes #7
Closes #9
Closes #14
Mikhail Putilovskij 2026-04-05 19:08:58 +03:00
parent 7fce4c9b3e
commit 6ced154124
35 changed files with 8380 additions and 67 deletions

75
bot-examples/README.md Normal file

@@ -0,0 +1,75 @@
# Reference Examples for Bot Development
Sanitized code examples from the agent-core project for building
Telegram and Matrix bots that integrate with LLM backends.
## Files
### Telegram Bot with Forum Topics
**`telegram_bot_topics.py`** — Complete Telegram bot using python-telegram-bot 22+.
Key patterns:
- **Forum topics**: Create/rename topics, route messages by `message_thread_id`
- **Message types**: Text, photos, voice/audio, documents — each with its own handler
- **Streaming responses**: Progressive message editing as LLM generates text
- **Outbox pattern**: LLM writes to `outbox.jsonl`, bot sends files after response
- **Topic naming**: LLM generates topic labels, bot auto-renames forum topics
- **Voice transcription**: Download voice → external STT → send text to LLM
- **Proxy support**: SOCKS5 proxy with retry logic for unreliable connections
Dependencies: `python-telegram-bot>=22.0`, `httpx`, `pyyaml`
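The outbox pattern above can be sketched as a small drain step the bot runs after each LLM response. `outbox.jsonl` is the file named in the list; the `drain_outbox` helper and the `path`/`caption` entry schema are illustrative assumptions, not the exact agent-core format:
```python
import json
from pathlib import Path

def drain_outbox(outbox_path: Path) -> list[dict]:
    """Read file-send requests the LLM appended, then clear the outbox."""
    if not outbox_path.exists():
        return []
    entries = []
    for line in outbox_path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entries.append(json.loads(line))  # e.g. {"path": "...", "caption": "..."}
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of dropping the whole batch
    outbox_path.unlink()  # clear so each file is sent exactly once
    return entries
```
The bot would then loop over the returned entries and call its platform's file-upload API for each `path`.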
### Matrix Bot with Room Management
**`matrix_bot_rooms.py`** — Matrix bot using matrix-nio with E2E encryption.
Key patterns:
- **Room creation**: Create private encrypted rooms, invite users, set avatars
- **Room modes**: Per-room behavior (quiet/context/full) stored in config.json
- **Multi-user**: Users map with per-user profiles loaded from YAML
- **E2E encryption**: Crypto store, key upload, cross-signing, device verification
- **Media handling**: Download + decrypt encrypted media (images, voice, files)
- **Message queuing**: Persistent queue (queue.jsonl) for messages arriving while busy
- **Status threads**: Post tool progress as thread replies under user's message
- **Session management**: Per-room Claude sessions with idle timeout, cancel support
- **Room naming**: Auto-generate room names from conversation content via local LLM
- **Bot commands**: `!new`, `!mode`, `!status`, `!security`, `!help`
- **Security modes**: strict/guarded/open for E2E device verification policy
- **Typing indicators**: Show typing while LLM processes
Dependencies: `matrix-nio[e2e]>=0.24`, `httpx`, `markdown`, `pyyaml`
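The per-room mode storage can be sketched like this; the mode names come from the list above, while the `config.json` schema and the `context` default are assumptions for illustration:
```python
import json
from pathlib import Path

VALID_MODES = ("quiet", "context", "full")  # mode names from the list above

def set_room_mode(config_path: Path, room_id: str, mode: str) -> None:
    """Persist a per-room behavior mode in config.json (illustrative schema)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    cfg = json.loads(config_path.read_text()) if config_path.exists() else {}
    cfg.setdefault("room_modes", {})[room_id] = mode
    config_path.write_text(json.dumps(cfg, indent=2))

def get_room_mode(config_path: Path, room_id: str) -> str:
    """Look up a room's mode, falling back to an assumed 'context' default."""
    cfg = json.loads(config_path.read_text()) if config_path.exists() else {}
    return cfg.get("room_modes", {}).get(room_id, "context")
```
A `!mode quiet` command handler would validate and delegate to `set_room_mode`, so the setting survives bot restarts.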
### Shared: LLM Session Manager
**`llm_session.py`** — Process manager for Claude Code CLI (adaptable to any LLM).
Key patterns:
- **Session persistence**: Save/restore session IDs for conversation continuity
- **Stream parsing**: Parse `stream-json` output for real-time tool/status tracking
- **Idle timeout**: Watchdog task resets on output, kills on silence
- **Cancel support**: External event to kill LLM process mid-turn
- **Fallback chain**: Primary LLM fails → try secondary provider
- **Sandbox**: bubblewrap (bwrap) wrapper for filesystem isolation
- **Status callbacks**: Emit events for tool_start, tool_end, thinking text
- **Environment isolation**: Strip sensitive env vars before spawning subprocess
### Shared: Config
**`config_example.py`** — Simple dataclass config loaded from environment variables.
## Architecture
```
User ──► Bot (Telegram/Matrix) ──► LLM Session Manager ──► Claude CLI (sandboxed)
              │                         │
              ├── media download        ├── session persistence
              ├── typing indicators     ├── stream parsing
              ├── outbox file sending   ├── timeout watchdog
              └── topic/room management └── fallback provider
```
The bot and LLM session are decoupled — the session manager doesn't know
about Telegram or Matrix. It takes a message string, runs the CLI process,
and returns text + status callbacks. The bot handles all platform-specific
concerns (formatting, media, rooms/topics).
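That boundary can be sketched as a pair of functions: the session side sees only text in and text out, and the adapter keeps every platform concern on its own side. The names and signature here are illustrative, not the exact `llm_session.py` API:
```python
from collections.abc import Awaitable, Callable

# Session-side contract: (topic_id, message) -> response text.
SendFn = Callable[[str, str], Awaitable[str]]

async def handle_platform_message(send_message: SendFn, topic_id: str, text: str) -> str:
    """Platform adapter: Telegram/Matrix specifics stay on this side only."""
    # ...download media, start typing indicator (platform-specific)...
    reply = await send_message(topic_id, text)
    # ...format for the platform, send outbox files, stop typing...
    return reply
```
Swapping Telegram for Matrix then means writing a new adapter, not touching the session manager.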

233
bot-examples/asr.py Normal file

@@ -0,0 +1,233 @@
"""ASR via an OpenAI-compatible STT server (GigaAM, Whisper, etc.).
Default: GigaAM (Russian-optimized, handles long-form audio server-side
via pyannote segmentation).
Fallback: Whisper (multilingual, needs client-side chunking for long audio).
Truncation detection and chunked retry apply only to Whisper-based backends.
"""
import asyncio
import logging
import os
import re
import tempfile
from pathlib import Path
import httpx
logger = logging.getLogger(__name__)
MAX_RETRIES = 3
TIMEOUT = 300.0
# If Whisper covers less than this fraction of the audio, retry with chunks
COVERAGE_THRESHOLD = 0.85
def _is_whisper(stt_url: str) -> bool:
"""Heuristic: URL points to a Whisper-based server."""
return "whisper" in stt_url.lower()
async def _get_duration(audio_path: str) -> float | None:
"""Get audio duration in seconds via ffprobe."""
try:
proc = await asyncio.create_subprocess_exec(
"ffprobe", "-v", "quiet", "-show_entries", "format=duration",
"-of", "default=noprint_wrappers=1:nokey=1", audio_path,
stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.DEVNULL,
)
stdout, _ = await proc.communicate()
return float(stdout.decode().strip())
except Exception:
return None
async def _find_split_points(audio_path: str, target_chunk: float = 30.0) -> list[float]:
"""Find silence gaps for splitting audio into ~target_chunk second pieces."""
try:
proc = await asyncio.create_subprocess_exec(
"ffmpeg", "-i", audio_path,
"-af", "silencedetect=noise=-35dB:d=0.4",
"-f", "null", "-",
stdout=asyncio.subprocess.DEVNULL, stderr=asyncio.subprocess.PIPE,
)
_, stderr = await proc.communicate()
output = stderr.decode("utf-8", errors="replace")
silences = []
for m in re.finditer(r"silence_end:\s*([\d.]+)", output):
silences.append(float(m.group(1)))
if not silences:
return []
duration = await _get_duration(audio_path) or silences[-1] + 10
splits = []
target = target_chunk
while target < duration - 10:
best = min(silences, key=lambda s: abs(s - target))
if not splits or best > splits[-1] + 10:
splits.append(best)
target += target_chunk
return splits
except Exception:
return []
async def _stt_request(
url: str, audio_path: str, language: str | None = None,
response_format: str = "json",
) -> dict:
"""Single STT API call. Returns the JSON response dict."""
last_exc = None
for attempt in range(MAX_RETRIES):
try:
async with httpx.AsyncClient(timeout=TIMEOUT) as client:
with open(audio_path, "rb") as f:
data = {"response_format": response_format}
if _is_whisper(url):
data["model"] = "Systran/faster-whisper-large-v3"
if language:
data["language"] = language
files = {"file": (Path(audio_path).name, f, "application/octet-stream")}
resp = await client.post(url, data=data, files=files)
if resp.status_code != 200:
raise RuntimeError(
f"STT API returned {resp.status_code}: {resp.text[:200]}"
)
return resp.json()
except (httpx.ConnectError, httpx.TimeoutException) as e:
last_exc = e
if attempt < MAX_RETRIES - 1:
logger.warning(
"STT connection error (attempt %d/%d): %s",
attempt + 1, MAX_RETRIES, e,
)
continue
except RuntimeError:
raise
except Exception as e:
raise RuntimeError(f"STT transcription failed: {e}") from e
raise RuntimeError(f"STT unavailable after {MAX_RETRIES} attempts: {last_exc}")
async def _transcribe_chunked(
url: str, audio_path: str, split_points: list[float],
language: str | None = None,
) -> str:
"""Split audio at silence boundaries and transcribe each chunk."""
tmpdir = tempfile.mkdtemp(prefix="asr_chunk_")
chunks = []
try:
boundaries = [0.0] + split_points
for i, start in enumerate(boundaries):
chunk_path = os.path.join(tmpdir, f"chunk{i}.ogg")
args = ["ffmpeg", "-y", "-i", audio_path, "-ss", str(start)]
if i < len(split_points):
args += ["-t", str(split_points[i] - start)]
args += ["-c", "copy", chunk_path]
proc = await asyncio.create_subprocess_exec(
*args,
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL,
)
await proc.wait()
chunks.append(chunk_path)
texts = []
for chunk in chunks:
if not os.path.exists(chunk) or os.path.getsize(chunk) < 100:
continue
result = await _stt_request(url, chunk, language=language)
text = result.get("text", "").strip()
if text:
texts.append(text)
return " ".join(texts)
finally:
for f in chunks:
try:
os.unlink(f)
except OSError:
pass
try:
os.rmdir(tmpdir)
except OSError:
pass
HYBRID_THRESHOLD = 30.0 # seconds — use Whisper for short, GigaAM for long
async def transcribe(
audio_path: str,
stt_url: str,
language: str | None = None,
whisper_url: str | None = None,
) -> tuple[str, str]:
"""Transcribe audio file via OpenAI-compatible STT server.
Hybrid mode: if both stt_url and whisper_url are provided, uses Whisper
for short audio (<30s) and the primary STT for longer audio.
Returns:
    (transcribed_text, engine_tag); engine_tag is "w" for Whisper or the
    first letter of the STT host otherwise (e.g. "g" for a GigaAM host).
Raises:
RuntimeError: If transcription fails after retries.
"""
# Hybrid: pick engine based on duration
chosen_url = stt_url
if whisper_url and whisper_url != stt_url:
duration = await _get_duration(audio_path)
if duration is not None and duration < HYBRID_THRESHOLD:
chosen_url = whisper_url
url = f"{chosen_url.rstrip('/')}/v1/audio/transcriptions"
whisper = _is_whisper(chosen_url)
engine_tag = "w" if whisper else chosen_url.split("//")[-1][0]
# For Whisper: use verbose_json to detect truncation
# For others: simple json is enough
fmt = "verbose_json" if whisper else "json"
result = await _stt_request(url, audio_path, language=language, response_format=fmt)
text = result.get("text", "").strip()
if not text:
raise RuntimeError("STT returned empty transcription")
# Whisper truncation detection — only for Whisper backends
if whisper:
file_duration = await _get_duration(audio_path)
segments = result.get("segments", [])
if file_duration and segments and file_duration > 30:
last_segment_end = segments[-1].get("end", 0)
coverage = last_segment_end / file_duration
if coverage < COVERAGE_THRESHOLD:
logger.warning(
"Whisper truncated %s: covered %.0f/%.0fs (%.0f%%), retrying with chunks",
Path(audio_path).name, last_segment_end, file_duration, coverage * 100,
)
split_points = await _find_split_points(audio_path, target_chunk=30.0)
if not split_points:
n_chunks = max(2, int(file_duration / 30))
split_points = [file_duration * i / n_chunks for i in range(1, n_chunks)]
chunked_text = await _transcribe_chunked(
url, audio_path, split_points, language=language,
)
if len(chunked_text) > len(text):
text = chunked_text
logger.info(
"Chunked transcription recovered %d chars (was %d)",
len(text), len(result.get("text", "")),
)
logger.info("Transcribed %s: %d chars [%s]", Path(audio_path).name, len(text), engine_tag)
return text, engine_tag

29
bot-examples/bwrap-claude Executable file

@@ -0,0 +1,29 @@
#!/usr/bin/env bash
# Sandboxed wrapper for Claude Code using bubblewrap.
# Restricts filesystem access: DATA_DIR is writable, system is read-only.
#
# Usage: bwrap-claude <claude-command> [args...]
# bwrap-claude claude -p --verbose ...
# bwrap-claude claude-zai -p --verbose ...
#
# Requires: bubblewrap (apt install bubblewrap)
set -euo pipefail
DATA_DIR="${DATA_DIR:?DATA_DIR must be set}"
exec bwrap \
--ro-bind / / \
--tmpfs /tmp \
--tmpfs /run \
--tmpfs /root \
--proc /proc \
--dev /dev \
--bind "$DATA_DIR" "$DATA_DIR" \
--bind "$HOME/.claude" "$HOME/.claude" \
--bind-try "$HOME/.claude-zai" "$HOME/.claude-zai" \
--setenv HOME "$HOME" \
--setenv DATA_DIR "$DATA_DIR" \
--die-with-parent \
--new-session \
"$@"


@@ -0,0 +1,60 @@
"""Load configuration from environment variables."""
import os
from dataclasses import dataclass, field
from pathlib import Path
@dataclass
class Config:
bot_token: str = ""
owner_id: int = 0
data_dir: Path = Path(".")
claude_cmd: str = "claude"
proxy: str | None = None
stt_url: str | None = None
allowed_tools: list[str] = field(default_factory=list)
claude_idle_timeout: int = 120
claude_max_timeout: int = 1800
workspace_dir: Path | None = None
@classmethod
def from_env(cls) -> "Config":
bot_token = os.environ.get("BOT_TOKEN", "")
owner_id_str = os.environ.get("OWNER_ID", "0")
owner_id = int(owner_id_str)
data_dir_str = os.environ.get("DATA_DIR", "")
if not data_dir_str:
raise ValueError("DATA_DIR env var is required")
data_dir = Path(data_dir_str)
claude_cmd = os.environ.get("CLAUDE_CMD", "claude")
proxy = os.environ.get("PROXY") or None
stt_url = os.environ.get("STT_URL") or os.environ.get("WHISPER_URL") or None
default_tools = "Read,Write,Edit,Glob,Grep,Bash,WebSearch,WebFetch,mcp__fetcher,mcp__yandex-search"
allowed_tools_str = os.environ.get("ALLOWED_TOOLS", default_tools)
allowed_tools = [t.strip() for t in allowed_tools_str.split(",") if t.strip()]
idle_timeout_str = os.environ.get("CLAUDE_IDLE_TIMEOUT",
os.environ.get("CLAUDE_TIMEOUT", "120"))
claude_idle_timeout = int(idle_timeout_str)
max_timeout_str = os.environ.get("CLAUDE_MAX_TIMEOUT", "1800")
claude_max_timeout = int(max_timeout_str)
workspace_dir_str = os.environ.get("WORKSPACE_DIR")
workspace_dir = Path(workspace_dir_str) if workspace_dir_str else None
return cls(
bot_token=bot_token,
owner_id=owner_id,
data_dir=data_dir,
claude_cmd=claude_cmd,
proxy=proxy,
stt_url=stt_url,
allowed_tools=allowed_tools,
claude_idle_timeout=claude_idle_timeout,
claude_max_timeout=claude_max_timeout,
workspace_dir=workspace_dir,
)

635
bot-examples/llm_session.py Normal file

@@ -0,0 +1,635 @@
"""Claude CLI session manager.
Manages Claude Code CLI sessions per topic. Each topic gets a persistent
session ID so conversation context is maintained across messages.
Uses --output-format stream-json with asyncio subprocess to stream responses.
Falls back to claude-zai if primary claude fails.
Timeout: idle-based (resets on any output from Claude) + hard ceiling.
Status: streams tool_use/agent events via on_status callback.
Cancel: external cancel_event to stop processing.
"""
import asyncio
import json
import logging
import os
import shutil
import time
import uuid
from collections.abc import Callable
from pathlib import Path
from core.config import Config
logger = logging.getLogger(__name__)
def _session_path(data_dir: Path, topic_id: int | str, provider: str = "") -> Path:
"""Path to session ID file for a topic."""
suffix = f"_{provider}" if provider else ""
return data_dir / "topics" / str(topic_id) / f"session{suffix}.txt"
def load_session(data_dir: Path, topic_id: int | str, provider: str = "") -> str | None:
"""Load existing session ID for a topic, or None."""
path = _session_path(data_dir, topic_id, provider)
if path.exists():
return path.read_text().strip()
return None
def save_session(data_dir: Path, topic_id: int | str, session_id: str, provider: str = "") -> None:
"""Save session ID for a topic."""
path = _session_path(data_dir, topic_id, provider)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(session_id)
async def send_message(
config: Config,
topic_id: int | str,
message: str,
on_chunk: Callable | None = None,
on_question: Callable | None = None,
on_status: Callable | None = None,
cancel_event: asyncio.Event | None = None,
idle_timeout_ref: list | None = None,
user_profile: str = "",
workspace_dir: Path | None = None,
) -> str:
"""Send a message to Claude CLI and return the response.
Args:
config: Application config.
topic_id: Topic ID (determines session and working directory).
message: User message text.
on_chunk: Optional async callback(text_so_far) for streaming updates.
on_question: Optional async callback(question) -> answer for ask-user tool.
on_status: Optional async callback(dict) for tool/agent status events.
cancel_event: Optional asyncio.Event set to cancel processing.
idle_timeout_ref: Optional mutable [int] current idle timeout in seconds.
Can be modified externally (e.g. user "more time" command).
user_profile: Optional user profile text (from user.md) to inject into system prompt.
workspace_dir: Optional per-user workspace directory path.
Returns:
Full response text.
Raises:
RuntimeError: If both primary and fallback CLI fail.
"""
# Try primary provider first
try:
return await _send_with_provider(config, topic_id, message, on_chunk, on_question,
on_status=on_status, cancel_event=cancel_event,
idle_timeout_ref=idle_timeout_ref,
provider="", user_profile=user_profile,
workspace_dir=workspace_dir)
except RuntimeError as e:
# Don't fallback if user cancelled
if cancel_event and cancel_event.is_set():
raise RuntimeError("Cancelled")
logger.warning("Primary claude failed (%s), trying fallback (claude-zai)", e)
# Fallback: claude-zai with separate session (using opus model)
try:
response = await _send_with_provider(
config, topic_id, message, on_chunk, on_question,
on_status=on_status, cancel_event=cancel_event,
idle_timeout_ref=idle_timeout_ref,
provider="zai", cmd_override="claude-zai", model_override="opus",
user_profile=user_profile, workspace_dir=workspace_dir,
)
# Add note that fallback provider was used
return response + "\n\n_[(via z.ai fallback)]_"
except RuntimeError:
raise RuntimeError("Both claude and claude-zai failed")
async def _watch_questions(topic_dir: Path, on_question: Callable) -> None:
"""Watch for ask-user.json and forward questions to the bot."""
question_file = topic_dir / "ask-user.json"
fifo_file = topic_dir / "ask-user.fifo"
while True:
await asyncio.sleep(0.5)
if not question_file.exists():
continue
try:
data = json.loads(question_file.read_text())
question = data.get("question", "")
logger.info("Claude asks user: %s", question[:200])
answer = await on_question(question)
# Write answer to FIFO (unblocks ask-user script)
with open(fifo_file, "w") as f:
f.write(answer)
question_file.unlink(missing_ok=True)
except Exception as e:
logger.error("Error handling ask-user: %s", e)
question_file.unlink(missing_ok=True)
def _tool_preview(tool_name: str, raw_input: str) -> str:
"""Extract a human-readable preview from tool input JSON."""
try:
inp = json.loads(raw_input)
except (json.JSONDecodeError, TypeError):
return raw_input[:200]
if tool_name == "Bash":
return inp.get("command", "")[:500]
if tool_name in ("Read", "Write"):
return inp.get("file_path", "")[:300]
if tool_name == "Edit":
return inp.get("file_path", "")[:300]
if tool_name in ("Glob", "Grep"):
return inp.get("pattern", "")[:200]
if tool_name == "WebSearch":
return inp.get("query", "")[:200]
if tool_name == "WebFetch":
return inp.get("url", "")[:300]
if tool_name == "Agent":
desc = inp.get("description", "")
prompt = inp.get("prompt", "")
return desc[:200] if desc else prompt[:300]
if tool_name == "TodoWrite":
todos = inp.get("todos", [])
if todos:
items = [t.get("content", "")[:80] for t in todos[:3]]
return "; ".join(items)
# Generic: show first key=value
for k, v in inp.items():
return f"{k}={str(v)[:200]}"
return ""
def _load_conversation_log(data_dir: Path, topic_id: str, limit: int = 5) -> str:
"""Load recent conversation log for context.
Returns formatted summary of last N interactions from log.jsonl,
so Claude has context even after session resets or fallback switches.
"""
log_file = data_dir / "rooms" / str(topic_id) / "log.jsonl"
if not log_file.exists():
return ""
try:
with open(log_file) as f:
entries = [json.loads(line.strip()) for line in f if line.strip()]
except Exception:
return ""
if not entries:
return ""
recent = entries[-limit:]
parts = []
for e in recent:
ts = e.get("ts", "")[:16].replace("T", " ")
user = e.get("user", "")[:300]
bot = e.get("bot", "")[:500]
parts.append(f"[{ts}] User: {user}")
parts.append(f"[{ts}] Bot: {bot}")
return "\n".join(parts)
async def _send_with_provider(
config: Config,
topic_id: int | str,
message: str,
on_chunk: Callable | None,
on_question: Callable | None,
on_status: Callable | None = None,
cancel_event: asyncio.Event | None = None,
idle_timeout_ref: list | None = None,
provider: str = "",
cmd_override: str | None = None,
model_override: str | None = None,
user_profile: str = "",
workspace_dir: Path | None = None,
_retry_count: int = 0,
) -> str:
"""Send message using a specific provider."""
existing_session = load_session(config.data_dir, topic_id, provider)
topic_dir = config.data_dir / "topics" / str(topic_id)
topic_dir.mkdir(parents=True, exist_ok=True)
cmd = cmd_override or config.claude_cmd
# Build args: --resume for existing sessions, --session-id for new ones
if existing_session:
session_flag = ["--resume", existing_session]
else:
new_id = str(uuid.uuid4())
session_flag = ["--session-id", new_id]
# User profile: prefer explicit parameter, fallback to workspace user.md
user_context = ""
if user_profile:
user_context = f"\n\nUSER PROFILE:\n{user_profile}\n"
elif config.workspace_dir:
user_md = config.workspace_dir / "user.md"
if user_md.exists():
user_context = f"\n\nUSER PROFILE:\n{user_md.read_text().strip()}\n"
# Load recent conversation log — provides context after session resets,
# fallback switches, or timeouts. Always included so Claude knows what happened.
conv_log = _load_conversation_log(config.data_dir, str(topic_id))
conv_context = ""
if conv_log:
conv_context = (
"\n\nRECENT CONVERSATION LOG (from bot's perspective, "
"may overlap with your session memory — use to fill gaps "
"after timeouts or session switches):\n" + conv_log + "\n"
)
# Per-user workspace context
workspace_context = ""
if workspace_dir and workspace_dir.is_dir():
ws_md = workspace_dir / "WORKSPACE.md"
if ws_md.exists():
workspace_context = (
f"\n\nUSER WORKSPACE ({workspace_dir}):\n"
f"{ws_md.read_text().strip()}\n"
f"\nYour working directory is the topic dir ({topic_dir}). "
f"Use it for scratch work (scripts, downloads, temp files). "
f"Save important/refined results to the workspace at {workspace_dir}. "
f"The workspace is a git repo — your changes will be committed automatically.\n"
)
# Paths Claude should know about
room_dir = config.data_dir / "rooms" / str(topic_id)
log_file = room_dir / "log.jsonl"
history_file = room_dir / "history.jsonl"
# System prompt with topic context
system_extra = (
f"Topic/room ID: {topic_id}. Data dir: {topic_dir}. "
f"After responding, update {config.data_dir / 'topic-map.yml'} "
f"with this topic's ID, path, and a short label. "
f"The bot renames the topic from the label. "
f"CONVERSATION HISTORY: Full conversation log is at {log_file} (JSONL, "
f"fields: ts, user, bot — every interaction with timestamps). "
f"Detailed message history with sender info: {history_file}. "
f"If you lose context (after timeout, session switch, or restart), "
f"READ these files to recover the full conversation. "
f"Entries ending with '[timed out]' or '[idle timeout]' mean your previous "
f"response was cut short — check what you were doing and continue. "
f"FORMATTING: User reads on mobile (Telegram/Matrix Element). "
f"NEVER use markdown tables — they render as broken text on mobile. "
f"Prefer bullet lists, bold headers, numbered lists to structure data. "
f"Small tables (2-4 cols, few rows): use monospace code block with aligned columns. "
f"Large/complex tables: generate HTML, convert to PDF via "
f"`html-to-pdf input.html output.pdf`, send via send-to-user. "
f"Do NOT use wkhtmltopdf — its PDFs are broken on iOS. "
f"SCREENSHOTS: `screenshot-page <url-or-file> output.png [--width 1280] [--height 900] "
f"[--wait 3] [--full-page] [--stealth]`. Works with URLs and local HTML files (folium maps etc). "
f"IMAGE SEARCH: `search-images \"query\" -o dir/ -n 4 -p prefix [--size large] "
f"[--orient horizontal]`. Uses Yandex Image Search API. Downloads images automatically. "
f"Add --no-download to just list URLs. "
f"WEB SEARCH: `search-web \"query\" [-n 10] [--lang ru]`. Yandex web search — "
f"best for Russian-language queries. Returns titles, URLs, snippets. "
f"Use for research, reviews, travel tips, local info. Lang: ru (default), en, tr. "
f"SENDING FILES: To send files to the user, use: `send-to-user <path> [caption]`. "
f"It is in PATH. The file will be delivered after your response. "
f"ASKING USER: To ask the user a question and wait for their reply, use: "
f"`ask-user \"your question\"`. It blocks until the user responds via the chat. "
f"IMAGE GENERATION: Use `generate-image` (NanoBanana/Gemini 3 Pro). "
f"It supports multi-turn chat for iterative refinement of images. "
f"First generation: `generate-image \"prompt\" output.png --chat history.json [-a 16:9]`. "
f"Refinement (edits the PREVIOUS image): `generate-image --chat history.json --refine \"change X to Y\" output2.png`. "
f"The --chat flag saves conversation context so the model remembers what it generated. "
f"ALWAYS use --chat with a history file in the current dir so you can refine later. "
f"The model can modify its own previous output when you use --refine — "
f"it does NOT generate from scratch, it edits the existing image. "
f"You can also pass reference images (up to 14): `generate-image \"prompt\" out.png --chat h.json --ref photo.jpg --ref photo2.jpg`. "
f"Aspect ratios: 9:16, 16:9, 1:1, 4:3, 3:4. Sizes: 1K, 2K, 4K (default). "
f"THREAD VISIBILITY: Your response is posted in a Matrix thread. "
f"The user sees ONLY the final message at a glance — intermediate tool output "
f"and thread messages are hidden unless expanded. "
f"All text the user needs to read MUST be in your response message, not only in files. "
f"Writing to files for persistence is fine, but the conversation text — "
f"analysis, notes, discussion points — must appear in the response itself. "
f"The user is chatting with you, not reading files. "
f"IMAGES IN CONTEXT: When conversation history contains entries like "
f"'[image: /path/to/file.png]', these are actual image files on disk. "
f"Use the Read tool to view them — they contain photos, screenshots, or book pages "
f"that the user shared. Always review referenced images before responding about them. "
f"TOOL DISCOVERY: Before installing packages or writing scripts, check what tools "
f"are already available. Common tools in PATH: transcribe-audio, send-to-user, "
f"ask-user, search-web, search-images, screenshot-page, generate-image, html-to-pdf, browser. "
f"BROWSER: If BROWSER_CDP_URL is set, you have access to a real Chrome browser via "
f"`browser <command>`. Commands: navigate <url>, screenshot [file], click <selector>, "
f"type <selector> <text>, read [selector], eval <js>, tabs, new [url], close. "
f"Use this for web interaction, authenticated sites, downloads, form filling. "
f"Run `ls /opt/agent-core/common-tools/` to see all. "
f"Prefer existing tools over writing new code."
f"{user_context}"
f"{workspace_context}"
f"{conv_context}"
)
claude_args = [
cmd,
*session_flag,
"-p",
"--verbose",
"--output-format", "stream-json",
"--append-system-prompt", system_extra,
"--allowedTools", ",".join(config.allowed_tools),
"--max-turns", "50",
]
if model_override:
claude_args.extend(["--model", model_override])
claude_args.append(message)
# Wrap with bwrap if available
bwrap_path = Path(__file__).resolve().parent.parent / "bwrap-claude"
if bwrap_path.exists() and shutil.which("bwrap"):
args = [str(bwrap_path)] + claude_args
else:
args = claude_args
# Build clean environment for Claude subprocess
_strip_prefixes = ("CLAUDECODE", "CLAUDE_CODE")
_strip_keys = {
"BOT_TOKEN", "MATRIX_ACCESS_TOKEN", "MATRIX_HOMESERVER",
"MATRIX_USER_ID", "MATRIX_OWNER_MXID", "MATRIX_DEVICE_ID",
}
# Auth env vars that must pass through to Claude CLI
_passthrough_keys = {"CLAUDE_CODE_OAUTH_TOKEN"}
env = {
k: v for k, v in os.environ.items()
if k in _passthrough_keys
or (not any(k.startswith(p) for p in _strip_prefixes) and k not in _strip_keys)
}
# Add common-tools to PATH so Claude can use send-to-user, generate-image, etc.
common_tools = str(Path(__file__).resolve().parent.parent / "common-tools")
env["PATH"] = common_tools + ":" + env.get("PATH", "")
# Load per-user workspace .env (Readest keys, Linkwarden keys, etc.)
if workspace_dir:
ws_env = workspace_dir / ".env"
if ws_env.exists():
for line in ws_env.read_text().splitlines():
line = line.strip()
if line and not line.startswith("#") and "=" in line:
key, _, val = line.partition("=")
env[key.strip()] = val.strip().strip("'\"") # handle KEY="value" and KEY='value'
session_label = existing_session[:8] if existing_session else f"new:{new_id[:8]}"
logger.info("Claude CLI: topic=%s session=%s cmd=%s", topic_id, session_label, cmd)
proc = await asyncio.create_subprocess_exec(
*args,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=str(topic_dir),
env=env,
limit=10 * 1024 * 1024, # 10MB — stream-json lines can be huge (base64 images)
)
response_parts: list[str] = []
full_text = ""
result_text = "" # clean final response from result event
result_session_id = None
timeout_reason = None
# Tool tracking for status events
block_tools: dict[str, str] = {} # tool_use_id -> tool name
# Idle timeout state — mutable so watchdog can read, user can extend
idle_timeout = idle_timeout_ref if idle_timeout_ref is not None else [config.claude_idle_timeout]
last_activity = [time.monotonic()]
start_time = time.monotonic()
# Start question watcher if callback provided
question_task = None
if on_question:
question_task = asyncio.create_task(_watch_questions(topic_dir, on_question))
# Watchdog: checks idle timeout, hard timeout, and cancel
async def _watchdog():
nonlocal timeout_reason
while True:
await asyncio.sleep(2)
now = time.monotonic()
if cancel_event and cancel_event.is_set():
timeout_reason = "cancelled"
proc.kill()
return
idle = now - last_activity[0]
if idle > idle_timeout[0]:
timeout_reason = "idle"
proc.kill()
return
elapsed = now - start_time
if elapsed > config.claude_max_timeout:
timeout_reason = "max"
proc.kill()
return
watchdog_task = asyncio.create_task(_watchdog())
# Stream log — save all events from Claude CLI for debugging/replay
stream_log_path = topic_dir / "stream.jsonl"
stream_log = open(stream_log_path, "a")
try:
async for line in proc.stdout:
last_activity[0] = time.monotonic() # reset idle timer on ANY output
line = line.decode("utf-8", errors="replace").strip()
if not line:
continue
# Log raw event to stream.jsonl
stream_log.write(line + "\n")
stream_log.flush()
try:
event = json.loads(line)
except json.JSONDecodeError:
logger.debug("Non-JSON stdout: %s", line[:200])
continue
etype = event.get("type")
# Capture session_id from init or result events
if etype == "system" and event.get("session_id"):
result_session_id = event["session_id"]
elif etype == "result" and event.get("session_id"):
result_session_id = event["session_id"]
# Handle result events — this has the clean final response
if etype == "result":
if event.get("is_error"):
errors = event.get("errors", [])
logger.error("Claude CLI error: %s", "; ".join(errors))
if event.get("result"):
result_text = event["result"]
# --- Status events from stream-json ---
# Claude CLI emits full "assistant" snapshots (with tool_use blocks)
# followed by "user" events (with tool_result).
if etype == "assistant":
content = event.get("message", {}).get("content", [])
has_tools = any(b.get("type") == "tool_use" for b in content)
for block in content:
if block.get("type") == "tool_use" and on_status:
tool_name = block.get("name", "")
tool_id = block.get("id", "")
inp = block.get("input", {})
preview = _tool_preview(tool_name, json.dumps(inp, ensure_ascii=False))
if tool_id:
block_tools[tool_id] = tool_name
if tool_name == "Agent":
desc = inp.get("description", "")
bg = inp.get("run_in_background", False)
await on_status({
"event": "agent_start",
"description": desc,
"background": bg,
})
else:
await on_status({
"event": "tool_start",
"tool": tool_name,
"input_preview": preview,
})
# All assistant text goes to thread as narration.
# Only result.result is the final clean response.
if block.get("type") == "text" and block.get("text"):
text = block["text"]
if on_status:
await on_status({
"event": "thinking",
"text": text,
})
# Also accumulate for on_chunk (Telegram streaming)
response_parts.append(text)
full_text = "".join(response_parts)
if on_chunk:
await on_chunk(full_text)
# Tool results mark tool completion
if etype == "user" and on_status:
content = event.get("message", {}).get("content", [])
if isinstance(content, list):
for block in content:
if isinstance(block, dict) and block.get("type") == "tool_result":
tool_id = block.get("tool_use_id", "")
tool_name = block_tools.pop(tool_id, "tool")
await on_status({"event": "tool_end", "tool": tool_name})
# Check if watchdog killed the process
if watchdog_task.done():
break
await proc.wait()
except Exception:
if not watchdog_task.done():
watchdog_task.cancel()
raise
finally:
stream_log.close()
if not watchdog_task.done():
watchdog_task.cancel()
try:
await watchdog_task
except asyncio.CancelledError:
pass
if question_task:
question_task.cancel()
try:
await question_task
except asyncio.CancelledError:
pass
elapsed = int(time.monotonic() - start_time)
# Handle timeout/cancel
if timeout_reason:
await proc.wait()
if timeout_reason == "cancelled":
logger.info("Claude CLI cancelled by user after %ds", elapsed)
suffix = "\n\n[cancelled by user]"
elif timeout_reason == "idle":
logger.warning("Claude CLI idle timeout after %ds (idle limit: %ds)", elapsed, idle_timeout[0])
suffix = f"\n\n[idle timeout — no output for {idle_timeout[0]}s]"
else:
logger.error("Claude CLI hard timeout after %ds (max: %ds)", elapsed, config.claude_max_timeout)
suffix = f"\n\n[timeout — {elapsed}s elapsed]"
# Save session even on timeout — don't lose conversation history
if result_session_id:
save_session(config.data_dir, topic_id, result_session_id, provider)
# On timeout: prefer result_text (clean), fall back to full_text (has thinking)
response = result_text or full_text
error_patterns = ["Failed to authenticate", "API Error:", "authentication_error", "401"]
if response and not any(p in response for p in error_patterns):
return response + suffix
raise RuntimeError(f"Claude CLI {timeout_reason} after {elapsed}s (error response: {full_text[:100]})")
# Save session ID for future resume
if result_session_id:
save_session(config.data_dir, topic_id, result_session_id, provider)
# Check for error responses (auth failures, API errors) - these should trigger fallback
error_patterns = ["Failed to authenticate", "API Error:", "authentication_error", "401"]
is_error_response = any(p in full_text for p in error_patterns)
if proc.returncode != 0 or is_error_response:
stderr = await proc.stderr.read()
stderr_text = stderr.decode("utf-8", errors="replace").strip()
logger.error("Claude CLI failed (rc=%d): %s", proc.returncode, stderr_text[:500])
if is_error_response:
raise RuntimeError(f"Claude CLI returned error: {full_text[:200]}")
response = result_text or full_text
if response:
return response
# Non-auth failure with no output — raise to trigger fallback
# but preserve session file (conversation history is valuable)
raise RuntimeError(f"Claude CLI exited with code {proc.returncode}")
response = result_text or full_text
if not response and _retry_count < 1:
logger.warning("Claude CLI returned empty response, retrying (attempt %d)", _retry_count + 1)
return await _send_with_provider(
config, topic_id, message, on_chunk, on_question,
on_status=on_status, cancel_event=cancel_event,
idle_timeout_ref=idle_timeout_ref,
provider=provider, cmd_override=cmd_override, model_override=model_override,
user_profile=user_profile, workspace_dir=workspace_dir,
_retry_count=_retry_count + 1,
)
return response or "(no response)"
def _extract_text(event: dict) -> str | None:
"""Extract text content from a stream-json event."""
etype = event.get("type")
if etype == "assistant":
content = event.get("message", {}).get("content", [])
texts = []
for block in content:
if block.get("type") == "text":
texts.append(block.get("text", ""))
return "".join(texts) if texts else None
if etype == "content_block_delta":
delta = event.get("delta", {})
if delta.get("type") == "text_delta":
return delta.get("text", "")
# Don't extract from "result" — it duplicates what was already
# streamed via "assistant" events. The caller uses it as fallback
# only if full_text is empty after processing all events.
return None

# bot-examples/matrix_bot_rooms.py (2667 lines): diff suppressed, file too large to display

# bot-examples/matrix_main.py (123 lines)
"""Entry point for Matrix bot frontend."""
import asyncio
import logging
import os
import sys
from pathlib import Path
import httpx
import yaml
from core.config import Config
from core.matrix_bot import MatrixBot
def _load_dotenv(workspace: Path) -> None:
env_file = workspace / ".env"
if not env_file.exists():
return
for line in env_file.read_text().splitlines():
line = line.strip()
if not line or line.startswith("#") or "=" not in line:
continue
key, _, value = line.partition("=")
key = key.strip()
value = value.strip().strip('"').strip("'")
if key not in os.environ:
os.environ[key] = value
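The parsing rules above (skip comments and blank lines, strip surrounding quotes, never override existing variables) are easy to check in isolation. This sketch extracts just the line-parsing step:

```python
def parse_env(text: str) -> dict[str, str]:
    # Same rules as _load_dotenv above: skip comments/blanks, strip quotes
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env

print(parse_env('FOO="bar"\n# comment\nBAZ=qux'))
# → {'FOO': 'bar', 'BAZ': 'qux'}
```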
def _load_users(workspace: Path) -> dict[str, dict]:
"""Load users.yml from workspace. Returns {mxid: {profile: ...}}."""
users_file = workspace / "users.yml"
if not users_file.exists():
return {}
with open(users_file) as f:
data = yaml.safe_load(f) or {}
return data
async def main() -> None:
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s %(name)s %(levelname)s %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
workspace_dir = os.environ.get("WORKSPACE_DIR")
if workspace_dir:
_load_dotenv(Path(workspace_dir))
# MATRIX_DATA_DIR overrides DATA_DIR for Matrix bot
matrix_data_dir = os.environ.get("MATRIX_DATA_DIR")
if matrix_data_dir:
os.environ["DATA_DIR"] = matrix_data_dir
# Matrix-specific env vars
homeserver = os.environ.get("MATRIX_HOMESERVER")
user_id = os.environ.get("MATRIX_USER_ID")
access_token = os.environ.get("MATRIX_ACCESS_TOKEN")
owner_mxid = os.environ.get("MATRIX_OWNER_MXID", "")
admin_mxid = os.environ.get("MATRIX_ADMIN_MXID", "") # For admin notifications
if not all([homeserver, user_id, access_token]):
logging.error(
"Missing Matrix config. Need: MATRIX_HOMESERVER, MATRIX_USER_ID, "
"MATRIX_ACCESS_TOKEN"
)
sys.exit(1)
# Resolve device_id from server (must match access token)
async with httpx.AsyncClient() as http:
resp = await http.get(
f"{homeserver}/_matrix/client/v3/account/whoami",
headers={"Authorization": f"Bearer {access_token}"},
timeout=10,
)
if resp.status_code != 200:
logging.error("whoami failed (%d): %s", resp.status_code, resp.text)
sys.exit(1)
device_id = resp.json().get("device_id")
logging.info("Resolved device_id: %s", device_id)
# Load users map (multi-user mode)
users = {}
if workspace_dir:
users = _load_users(Path(workspace_dir))
if not users and not owner_mxid:
logging.error("Need either users.yml in workspace or MATRIX_OWNER_MXID env var")
sys.exit(1)
try:
config = Config.from_env()
except ValueError as e:
logging.error("Config error: %s", e)
sys.exit(1)
if config.workspace_dir:
logging.info("Workspace: %s", config.workspace_dir)
# Symlink workspace CLAUDE.md into data dir
claude_md_link = config.data_dir / "CLAUDE.md"
claude_md_src = config.workspace_dir / "CLAUDE.md"
if claude_md_src.exists() and not claude_md_link.exists():
claude_md_link.symlink_to(claude_md_src)
logging.info("Symlinked CLAUDE.md into data dir")
if users:
logging.info("Multi-user mode: %d users", len(users))
logging.info("Data dir: %s", config.data_dir)
bot = MatrixBot(config, homeserver, user_id, access_token,
owner_mxid=owner_mxid, users=users, device_id=device_id,
admin_mxid=admin_mxid)
try:
await bot.run()
except KeyboardInterrupt:
pass
finally:
await bot.close()
if __name__ == "__main__":
asyncio.run(main())

# (new file, 511 lines)
"""Telegram bot engine.
Handles messages (text, photo, voice), topic management, and Claude CLI integration.
Uses RetryHTTPXRequest for proxy resilience, progressive message editing for streaming.
"""
import asyncio
import json
import logging
import time
from datetime import datetime, timezone
from pathlib import Path
import yaml
from telegram import BotCommand, Update
from telegram.constants import ChatAction, ParseMode
from telegram.error import BadRequest, NetworkError
from telegram.ext import (
Application,
CommandHandler,
ContextTypes,
MessageHandler,
filters,
)
from telegram.request import HTTPXRequest
from core.asr import transcribe
from core.claude_session import send_message as claude_send
from core.config import Config
logger = logging.getLogger(__name__)
# Streaming edit parameters
EDIT_INTERVAL = 1.5 # seconds between message edits
EDIT_MIN_DELTA = 150 # minimum new chars before editing
class RetryHTTPXRequest(HTTPXRequest):
"""HTTPXRequest with retry on ConnectError (SOCKS5 proxy hiccups)."""
MAX_RETRIES = 3
RETRY_DELAY = 2
async def do_request(self, *args, **kwargs):
last_exc = None
for attempt in range(self.MAX_RETRIES):
try:
return await super().do_request(*args, **kwargs)
except NetworkError as e:
if "ConnectError" in str(e):
last_exc = e
if attempt < self.MAX_RETRIES - 1:
logger.warning(
"Telegram ConnectError (attempt %d/%d), retrying in %ds...",
attempt + 1, self.MAX_RETRIES, self.RETRY_DELAY,
)
await asyncio.sleep(self.RETRY_DELAY)
else:
raise
raise last_exc
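The same bounded-retry shape, factored out of the HTTP layer as a sketch; `ConnectionError` stands in here for the proxy `ConnectError` the class above retries:

```python
import asyncio

async def with_retries(fn, attempts: int = 3, delay: float = 0.01):
    # Retry transient connection errors; re-raise everything else immediately
    last_exc = None
    for attempt in range(attempts):
        try:
            return await fn()
        except ConnectionError as e:
            last_exc = e
            if attempt < attempts - 1:
                await asyncio.sleep(delay)
    raise last_exc
```

As in `RetryHTTPXRequest.do_request`, the last exception is re-raised once the attempt budget is exhausted.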
def build_app(config: Config) -> Application:
"""Build and configure the Telegram Application."""
builder = Application.builder().token(config.bot_token)
# Configure HTTP client with proxy and timeouts
request_kwargs = {
"connect_timeout": 30.0,
"read_timeout": 60.0,
"write_timeout": 60.0,
"pool_timeout": 10.0,
}
if config.proxy:
request_kwargs["proxy"] = config.proxy
request = RetryHTTPXRequest(**request_kwargs)
builder = builder.request(request)
builder = builder.concurrent_updates(True)
app = builder.build()
# Store config in bot_data for handler access
app.bot_data["config"] = config
# Register handlers (order matters — more specific first)
app.add_handler(CommandHandler("start", handle_start))
app.add_handler(CommandHandler("newtopic", handle_new_topic))
app.add_handler(MessageHandler(filters.PHOTO, handle_photo))
app.add_handler(MessageHandler(filters.VOICE | filters.AUDIO, handle_voice))
app.add_handler(MessageHandler(filters.Document.ALL, handle_document))
app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_message))
# Post-init: set bot commands
app.post_init = _post_init
return app
async def _post_init(application: Application) -> None:
"""Set bot commands menu after initialization."""
commands = [
BotCommand("newtopic", "Create a new topic"),
BotCommand("start", "Start / help"),
]
await application.bot.set_my_commands(commands)
logger.info("Bot initialized: @%s", application.bot.username)
def _get_config(context: ContextTypes.DEFAULT_TYPE) -> Config:
return context.bot_data["config"]
def _is_owner(update: Update, config: Config) -> bool:
return update.effective_user and update.effective_user.id == config.owner_id
def _topic_id(update: Update) -> str:
"""Get topic ID from message, or 'general' for the default topic."""
thread_id = update.effective_message.message_thread_id
return str(thread_id) if thread_id else "general"
def _topic_dir(config: Config, topic_id: str) -> Path:
"""Get data directory for a topic."""
d = config.data_dir / "topics" / topic_id
d.mkdir(parents=True, exist_ok=True)
return d
def _log_interaction(config: Config, topic_id: str, user_msg: str, bot_msg: str) -> None:
"""Append interaction to topic log."""
log_file = _topic_dir(config, topic_id) / "log.jsonl"
entry = {
"ts": datetime.now(timezone.utc).isoformat(),
"user": user_msg[:1000],
"bot": bot_msg[:2000],
}
with open(log_file, "a") as f:
f.write(json.dumps(entry, ensure_ascii=False) + "\n")
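Each log entry is one JSON object per line, so reading a topic log back is one `json.loads` per non-empty line. A small round-trip sketch of the format `_log_interaction` writes:

```python
import json

entry = {"ts": "2026-04-05T16:00:00+00:00", "user": "hello", "bot": "hi there"}
line = json.dumps(entry, ensure_ascii=False)

# Reading the log back: one object per non-empty line
log_text = line + "\n" + line + "\n"
entries = [json.loads(l) for l in log_text.splitlines() if l.strip()]
print(len(entries), entries[0]["user"])  # → 2 hello
```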
def _md_to_html(text: str) -> str:
"""Convert common Markdown to Telegram HTML."""
import re
    # Escape HTML entities first, before any of our own tags are inserted
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
# Code blocks: ```lang\n...\n```
text = re.sub(
r"```\w*\n(.*?)```",
lambda m: f"<pre>{m.group(1)}</pre>",
text, flags=re.DOTALL,
)
# Inline code: `...`
text = re.sub(r"`([^`]+)`", r"<code>\1</code>", text)
# Bold: **...**
text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
# Italic: *...*
text = re.sub(r"\*(.+?)\*", r"<i>\1</i>", text)
# Headers: ## ... → bold line
text = re.sub(r"^#{1,6}\s+(.+)$", r"<b>\1</b>", text, flags=re.MULTILINE)
    # Bullet lists: "- item" → "• item"
    text = re.sub(r"^- ", "• ", text, flags=re.MULTILINE)
return text
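The ordering above matters: entities are escaped before any tags are inserted, so user text cannot smuggle raw HTML past Telegram's parser. A reduced sketch of the first two steps:

```python
import re

def md_to_html(text: str) -> str:
    # Same ordering as _md_to_html above: escape first, then insert tags
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    text = re.sub(r"`([^`]+)`", r"<code>\1</code>", text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
    return text

print(md_to_html("use `a < b` and **bold**"))
# → use <code>a &lt; b</code> and <b>bold</b>
```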
async def _edit_text_md(message, text: str) -> None:
"""Edit message with HTML formatting, falling back to plain text."""
try:
html = _md_to_html(text)
await message.edit_text(html, parse_mode=ParseMode.HTML)
except BadRequest:
try:
await message.edit_text(text)
except BadRequest:
pass
# Cache of topic labels we've already applied: {topic_id: label}
_applied_labels: dict[str, str] = {}
# Pending questions from Claude: {topic_id: asyncio.Future}
_pending_questions: dict[str, asyncio.Future] = {}
async def _sync_topic_name(update: Update, config: Config, topic_id: str) -> None:
"""Rename Telegram topic if topic-map.yml has a new/changed label."""
if topic_id == "general":
return
topic_map_path = config.data_dir / "topic-map.yml"
if not topic_map_path.exists():
return
try:
with open(topic_map_path) as f:
topic_map = yaml.safe_load(f) or {}
entry = topic_map.get(topic_id) or topic_map.get(int(topic_id))
if not entry or not isinstance(entry, dict):
return
label = entry.get("label")
if not label or _applied_labels.get(topic_id) == label:
return
await update.get_bot().edit_forum_topic(
chat_id=update.effective_chat.id,
message_thread_id=int(topic_id),
name=label[:128],
)
_applied_labels[topic_id] = label
logger.info("Renamed topic %s to: %s", topic_id, label)
except BadRequest as e:
if "not modified" not in str(e).lower():
logger.warning("Failed to rename topic %s: %s", topic_id, e)
_applied_labels[topic_id] = label # don't retry
except Exception as e:
logger.warning("Error reading topic-map.yml: %s", e)
async def handle_start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle /start command."""
config = _get_config(context)
if not _is_owner(update, config):
return
await update.effective_message.reply_text(
"Ready. Send me a message or use /newtopic to create a topic."
)
async def handle_new_topic(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle /newtopic <name> — create a forum topic."""
config = _get_config(context)
if not _is_owner(update, config):
return
name = " ".join(context.args) if context.args else None
if not name:
await update.effective_message.reply_text("Usage: /newtopic Topic Name")
return
try:
topic = await context.bot.create_forum_topic(
chat_id=update.effective_chat.id,
name=name,
)
tid = str(topic.message_thread_id)
_topic_dir(config, tid)
await context.bot.send_message(
chat_id=update.effective_chat.id,
message_thread_id=topic.message_thread_id,
            text="Topic created. Send me anything here.",
)
logger.info("Created topic: %s (id=%s)", name, tid)
except BadRequest as e:
logger.error("Failed to create topic: %s", e)
await update.effective_message.reply_text(f"Failed to create topic: {e}")
async def handle_message(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle text messages — send to Claude CLI."""
config = _get_config(context)
if not _is_owner(update, config):
return
tid = _topic_id(update)
user_text = update.effective_message.text
# If Claude is waiting for an answer in this topic, deliver it
if tid in _pending_questions:
future = _pending_questions.pop(tid)
if not future.done():
future.set_result(user_text)
return
# Send typing indicator and placeholder
await context.bot.send_chat_action(
chat_id=update.effective_chat.id,
action=ChatAction.TYPING,
message_thread_id=update.effective_message.message_thread_id,
)
placeholder = await update.effective_message.reply_text("thinking...")
# Streaming state
last_edit_time = 0.0
last_edit_len = 0
async def on_chunk(text_so_far: str):
nonlocal last_edit_time, last_edit_len
now = time.monotonic()
delta = len(text_so_far) - last_edit_len
if delta >= EDIT_MIN_DELTA and (now - last_edit_time) >= EDIT_INTERVAL:
try:
display = _truncate_for_telegram(text_so_far)
await placeholder.edit_text(display)
last_edit_time = now
last_edit_len = len(text_so_far)
except BadRequest:
pass # message not modified or too long
async def on_question(question: str) -> str:
"""Claude asks user a question — send it and wait for reply."""
        await update.effective_message.reply_text(question)
        loop = asyncio.get_running_loop()
future = loop.create_future()
_pending_questions[tid] = future
return await future
    topic_dir = _topic_dir(config, tid)
    response = ""  # defined up front: the finally block logs it even on unexpected errors
    try:
response = await claude_send(
config, tid, user_text, on_chunk=on_chunk, on_question=on_question,
)
display = _truncate_for_telegram(response)
await _edit_text_md(placeholder, display)
except RuntimeError as e:
logger.error("Claude error for topic %s: %s", tid, e)
await placeholder.edit_text(f"Error: {e}")
response = f"[error] {e}"
finally:
_pending_questions.pop(tid, None)
await _send_outbox(update, topic_dir)
_log_interaction(config, tid, user_text, response)
await _sync_topic_name(update, config, tid)
async def handle_photo(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle photo messages — save image, send path to Claude."""
config = _get_config(context)
if not _is_owner(update, config):
return
tid = _topic_id(update)
images_dir = _topic_dir(config, tid) / "images"
images_dir.mkdir(exist_ok=True)
# Download the largest photo
photo = update.effective_message.photo[-1]
file = await context.bot.get_file(photo.file_id)
ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
filename = f"{ts}_{photo.file_unique_id}.jpg"
filepath = images_dir / filename
await file.download_to_drive(str(filepath))
caption = update.effective_message.caption or ""
message = f"User sent an image: {filepath}"
if caption:
message += f"\nCaption: {caption}"
# Send typing and placeholder
placeholder = await update.effective_message.reply_text("looking at image...")
try:
response = await claude_send(config, tid, message)
display = _truncate_for_telegram(response)
await _edit_text_md(placeholder, display)
except RuntimeError as e:
logger.error("Claude error for photo in topic %s: %s", tid, e)
await placeholder.edit_text(f"Error: {e}")
response = f"[error] {e}"
_log_interaction(config, tid, f"[photo] {caption}", response)
await _sync_topic_name(update, config, tid)
async def handle_document(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle document messages — save file, send path to Claude."""
config = _get_config(context)
if not _is_owner(update, config):
return
tid = _topic_id(update)
docs_dir = _topic_dir(config, tid) / "documents"
docs_dir.mkdir(exist_ok=True)
doc = update.effective_message.document
file = await context.bot.get_file(doc.file_id)
# Use original filename if available, otherwise generate one
orig_name = doc.file_name or f"{doc.file_unique_id}"
ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
filename = f"{ts}_{orig_name}"
filepath = docs_dir / filename
await file.download_to_drive(str(filepath))
caption = update.effective_message.caption or ""
message = f"User sent a document: {filepath} (name: {orig_name}, size: {doc.file_size} bytes)"
if caption:
message += f"\nCaption: {caption}"
topic_dir = _topic_dir(config, tid)
placeholder = await update.effective_message.reply_text("reading document...")
try:
response = await claude_send(config, tid, message)
display = _truncate_for_telegram(response)
await _edit_text_md(placeholder, display)
except RuntimeError as e:
logger.error("Claude error for document in topic %s: %s", tid, e)
await placeholder.edit_text(f"Error: {e}")
response = f"[error] {e}"
await _send_outbox(update, topic_dir)
_log_interaction(config, tid, f"[document: {orig_name}] {caption}", response)
await _sync_topic_name(update, config, tid)
async def handle_voice(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle voice/audio messages — save file, send path to Claude."""
config = _get_config(context)
if not _is_owner(update, config):
return
tid = _topic_id(update)
voice_dir = _topic_dir(config, tid) / "voice"
voice_dir.mkdir(exist_ok=True)
# Download voice file
voice = update.effective_message.voice or update.effective_message.audio
file = await context.bot.get_file(voice.file_id)
ext = "ogg" if update.effective_message.voice else "mp3"
ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
filename = f"{ts}_{voice.file_unique_id}.{ext}"
filepath = voice_dir / filename
await file.download_to_drive(str(filepath))
topic_dir = _topic_dir(config, tid)
# Transcribe via Whisper if available, otherwise send file path
if config.whisper_url:
placeholder = await update.effective_message.reply_text("transcribing voice...")
try:
text = await transcribe(str(filepath), config.whisper_url)
message = f"[voice message transcription]: {text}"
logger.info("Transcribed voice in topic %s: %d chars", tid, len(text))
# Show transcription to user, then send to Claude
try:
await placeholder.edit_text(f"🎤 {text}")
except BadRequest:
pass
placeholder = await update.effective_message.reply_text("thinking...")
except RuntimeError as e:
logger.error("ASR failed for topic %s: %s", tid, e)
message = f"User sent a voice message: {filepath} (duration: {voice.duration}s)\n(transcription failed: {e})"
else:
message = f"User sent a voice message: {filepath} (duration: {voice.duration}s)"
placeholder = await update.effective_message.reply_text("processing voice...")
try:
response = await claude_send(config, tid, message)
display = _truncate_for_telegram(response)
await _edit_text_md(placeholder, display)
except RuntimeError as e:
logger.error("Claude error for voice in topic %s: %s", tid, e)
await placeholder.edit_text(f"Error: {e}")
response = f"[error] {e}"
await _send_outbox(update, topic_dir)
_log_interaction(config, tid, message, response)
await _sync_topic_name(update, config, tid)
async def _send_outbox(update: Update, topic_dir: Path) -> None:
"""Send files queued in outbox.jsonl by Claude via send-to-user tool."""
outbox = topic_dir / "outbox.jsonl"
if not outbox.exists():
return
entries = []
try:
with open(outbox) as f:
for line in f:
line = line.strip()
if line:
entries.append(json.loads(line))
# Clear outbox
outbox.unlink()
except Exception as e:
logger.error("Failed to read outbox: %s", e)
return
for entry in entries:
fpath = Path(entry.get("path", ""))
ftype = entry.get("type", "document")
caption = entry.get("caption", "") or fpath.name
if not fpath.is_file():
logger.warning("Outbox file not found: %s", fpath)
continue
try:
with open(fpath, "rb") as f:
if ftype == "image":
await update.effective_message.reply_photo(photo=f, caption=caption)
elif ftype == "video":
await update.effective_message.reply_video(video=f, caption=caption)
elif ftype == "audio":
await update.effective_message.reply_voice(voice=f, caption=caption)
else:
await update.effective_message.reply_document(document=f, caption=caption)
logger.info("Sent %s: %s", ftype, fpath.name)
except Exception as e:
logger.error("Failed to send %s %s: %s", ftype, fpath.name, e)
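An outbox entry is a single JSON line with `path`, `type`, and an optional `caption`; the bot drains the file and deletes it after each response. A round-trip sketch (the entry values here are hypothetical):

```python
import json
import tempfile
from pathlib import Path

entry = {"path": "/tmp/report.pdf", "type": "document", "caption": "Weekly report"}

with tempfile.TemporaryDirectory() as d:
    outbox = Path(d) / "outbox.jsonl"
    outbox.write_text(json.dumps(entry, ensure_ascii=False) + "\n")
    # Drain: parse every non-empty line, then clear the file
    drained = [json.loads(l) for l in outbox.read_text().splitlines() if l.strip()]
    outbox.unlink()
print(drained[0]["type"])  # → document
```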
def _truncate_for_telegram(text: str, max_len: int = 4096) -> str:
"""Truncate text to Telegram message limit."""
if len(text) <= max_len:
return text
return text[: max_len - 20] + "\n\n[truncated]"
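Telegram caps a message at 4096 characters; the helper reserves room for the marker before cutting. A quick check of the arithmetic (standalone copy of the function above):

```python
def truncate_for_telegram(text: str, max_len: int = 4096) -> str:
    if len(text) <= max_len:
        return text
    # Reserve 20 chars so the suffix never pushes past the limit
    return text[: max_len - 20] + "\n\n[truncated]"

out = truncate_for_telegram("x" * 5000)
print(len(out))  # → 4089
```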

# (new file, 75 lines)
"""Entry point for agent-core bot.
Loads config from environment, optionally reads .env from workspace,
builds and runs the Telegram bot.
"""
import logging
import sys
from pathlib import Path
from core.bot import build_app
from core.config import Config
def _load_dotenv(workspace_dir: Path | None) -> None:
"""Load .env file from workspace directory if it exists."""
if not workspace_dir:
return
env_file = workspace_dir / ".env"
if not env_file.exists():
return
import os
for line in env_file.read_text().splitlines():
line = line.strip()
if not line or line.startswith("#"):
continue
if "=" not in line:
continue
key, _, value = line.partition("=")
key = key.strip()
value = value.strip().strip('"').strip("'")
# Don't override existing env vars
if key not in os.environ:
os.environ[key] = value
def main() -> None:
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s %(name)s %(levelname)s %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
import os
workspace_dir = os.environ.get("WORKSPACE_DIR")
if workspace_dir:
_load_dotenv(Path(workspace_dir))
try:
config = Config.from_env()
except ValueError as e:
logging.error("Config error: %s", e)
sys.exit(1)
if config.workspace_dir:
logging.info("Workspace: %s", config.workspace_dir)
# Symlink workspace CLAUDE.md into data dir so Claude CLI finds it
# when running in topic subdirectories
claude_md_link = config.data_dir / "CLAUDE.md"
claude_md_src = config.workspace_dir / "CLAUDE.md"
if claude_md_src.exists() and not claude_md_link.exists():
claude_md_link.symlink_to(claude_md_src)
logging.info("Symlinked CLAUDE.md into data dir")
logging.info("Data dir: %s", config.data_dir)
app = build_app(config)
app.run_polling(
allowed_updates=["message", "edited_message"],
stop_signals=None,
)
if __name__ == "__main__":
main()